I mostly do **data analysis and model building** in Jupyter. These are notes on the things I keep having to look up. (Updated from time to time.)
Modules that have already been imported are cached, so importing them again the normal way does not pick up changes. **You can reload (re-import) them as follows.**
import importlib
importlib.reload(hoge)
# hoge is a module that has already been imported
I didn't know this until recently and restarted the kernel every time, so it was a real eye-opener.
Another option is to have modules reloaded automatically with the %autoreload magic.
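To see the caching behaviour described above outside of a notebook, here is a minimal sketch. The module name `hoge_demo` and the temporary directory are made up for illustration; the sketch writes a throwaway module, edits it on disk, and compares a plain re-import with `importlib.reload`.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Avoid stale bytecode caches when the file is rewritten within the same second.
sys.dont_write_bytecode = True

# Create a throwaway module (hypothetical name) in a temporary directory.
tmp_dir = tempfile.mkdtemp()
module_path = Path(tmp_dir) / "hoge_demo.py"
module_path.write_text("VALUE = 1\n")
sys.path.insert(0, tmp_dir)

import hoge_demo

first = hoge_demo.VALUE

# Edit the module on disk, then import it again the normal way.
module_path.write_text("VALUE = 200\n")
import hoge_demo  # served from sys.modules, so the edit is not picked up

after_plain_import = hoge_demo.VALUE

# reload() re-executes the module source in the existing module object.
importlib.reload(hoge_demo)
after_reload = hoge_demo.VALUE

print(first, after_plain_import, after_reload)
```

Note that names pulled in with `from hoge_demo import VALUE` keep pointing at the old object after a reload, so accessing attributes through the module is the safer habit. In a notebook, the automatic variant is usually enabled with `%load_ext autoreload` followed by `%autoreload 2`.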
If you have several Jupyter servers running, it is often hard to tell which browser tab belongs to which server. You can change the tab title as follows.
Run the following in a Jupyter cell:
%%javascript
document.title='Jupyter-GPU'
Alternatively, it can be specified at build time.
jupyter lab build --name='Jupyter-GPU'
Reference: https://github.com/jupyterlab/jupyterlab/issues/4422#issuecomment-395962448
If you are not particular about fonts, **japanize-matplotlib is the quickest way** to display Japanese text in matplotlib plots.
pip install japanize-matplotlib
import matplotlib.pyplot as plt
import japanize_matplotlib
plt.plot([1, 2, 3, 4])
plt.xlabel('The joy of using Japanese easily')
plt.show()
There are two ways to start the debugger.
**Set a breakpoint and start the debugger there.** Insert the code below at the point where you want to stop:
from IPython.core.debugger import Pdb; Pdb().set_trace()
**Start the debugger when an error occurs.** To apply this to a specific cell only:
# Put this at the top of the cell you want to debug
%%debug
To apply it to the entire notebook:
# Put this anywhere in your notebook
%pdb on
# Use this when you want to turn automatic debugging off again
%pdb off
For progress bars, tqdm should be enough for now.
from tqdm.auto import tqdm
import numpy as np
# Wrap the iterable in tqdm
for i in tqdm(np.arange(1, 100000, 1)):
    # Process here
    pass
import pandas as pd
import numpy as np
from tqdm.auto import tqdm
# set description
tqdm.pandas(desc="Do this")
# apply
df = pd.DataFrame({'hoge': np.arange(1, 100000, 1)})
df['hoge'] = df['hoge'].progress_apply(lambda x: x + 1)
Increase the number of rows that can be displayed and the maximum number of characters shown in one cell.
import pandas as pd
pd.set_option("display.max_colwidth", 500)  # show up to 500 characters per cell
pd.set_option("display.max_rows", 100)  # show up to 100 rows
**Note: if set_option does not seem to take effect in JupyterLab, it works well to display no more than max_rows records, for example df[:100] (from my personal experience).**
If you want to prevent truncation for one specific cell only, do the following (thanks to @chik_taks for the tip!)
with pd.option_context('display.max_colwidth', 200):
    display(df)
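The point of the `with` block above is that the setting is temporary. A small sketch that checks this by reading the option before, inside, and after the block (the variable names are only for illustration):

```python
import pandas as pd

# Read the current value so we can compare later.
before = pd.get_option("display.max_colwidth")

with pd.option_context("display.max_colwidth", 200):
    # Inside the block the temporary value is active.
    inside = pd.get_option("display.max_colwidth")

# After the block the previous value is restored automatically.
after = pd.get_option("display.max_colwidth")

print(before, inside, after)
```

By contrast, a value set with `pd.set_option` stays in effect for the rest of the session unless it is changed again or cleared with `pd.reset_option`.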
You can output a DataFrame as a Markdown table and copy it. I quietly use this quite a lot.
pip install pytablewriter
import pytablewriter
writer = pytablewriter.MarkdownTableWriter()
writer.from_dataframe(df)
writer.write_table()
# | col1 | col2 |
# |------|--------|
# |hoge1 |line1 |
# |hoge2 |line2 |
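If installing an extra package is not an option, a small hand-rolled helper can produce a similar table using pandas alone. This is not pytablewriter's implementation, just a rough sketch; `df_to_markdown` is a hypothetical name, and it does no column alignment or escaping of `|` characters.

```python
import pandas as pd


def df_to_markdown(df: pd.DataFrame) -> str:
    """Render a DataFrame (without its index) as a simple Markdown table."""
    header = "| " + " | ".join(str(c) for c in df.columns) + " |"
    separator = "| " + " | ".join("---" for _ in df.columns) + " |"
    rows = [
        "| " + " | ".join(str(v) for v in row) + " |"
        for row in df.itertuples(index=False, name=None)
    ]
    return "\n".join([header, separator, *rows])


df = pd.DataFrame({"col1": ["hoge1", "hoge2"], "col2": ["line1", "line2"]})
markdown = df_to_markdown(df)
print(markdown)
# | col1 | col2 |
# | --- | --- |
# | hoge1 | line1 |
# | hoge2 | line2 |
```

For anything beyond quick copy and paste (alignment, escaping, number formatting), a dedicated library such as pytablewriter is the better choice.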