I'm using Jupyter Lab, and after deleting variables, `del` plus `gc.collect()` doesn't free the memory.
Put the following code in one cell and run it.
import numpy as np
import pandas as pd
import psutil
print(psutil.Process().memory_info().rss / 1024**2)
df = pd.DataFrame(np.arange(50000000).reshape(-1,5))
df.head()
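For scale: the frame above holds 50,000,000 int64 values, so the data alone is roughly 381 MiB, which should match the jump you see in RSS between the two measurements.

```python
# 50,000,000 int64 elements * 8 bytes each, converted to MiB
print(50_000_000 * 8 / 1024**2)  # about 381.5 MiB
```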
Note that the cell ends with df.head().
In the next cell, run:
print(psutil.Process().memory_info().rss / 1024**2)
del df
print(psutil.Process().memory_info().rss / 1024**2)
The memory usage may not drop right away, so check it again in a following cell just to be sure.
print(psutil.Process().memory_info().rss / 1024**2)
I first suspected pandas.DataFrame, but that is not the problem. When a value is displayed as a cell's output by Jupyter Lab, the memory is apparently not released even if you del the variable in a later cell.
Do not end a cell with df.head(). Use print() if you want to display it.
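As a rough illustration of why `del` alone is not enough: IPython's output cache (the `Out` dict, plus `_`, `__`, `___`) keeps its own reference to any value a cell displays, so the object's reference count never reaches zero. The sketch below simulates that with a plain dict; the `cache` dict and `Big` class are illustrative stand-ins, not IPython internals.

```python
import weakref

class Big:
    """Stand-in for a large DataFrame."""
    pass

obj = Big()
cache = {1: obj}           # simulates IPython's Out[1] output cache
ref = weakref.ref(obj)     # lets us observe whether the object is alive

del obj                    # like `del df` in the next cell
print(ref() is not None)   # True: the cache still holds a reference

del cache[1]               # like clearing the output cache entry
print(ref() is None)       # True: now the object can be collected
```

In a real notebook, IPython's `%xdel df` magic tries to remove such hidden references, and `%reset -f out` clears the whole output cache.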
-- It's behavior that makes you uneasy; perhaps it's something experienced Jupyter users know to watch for (I don't know).
-- I was able to investigate and confirm it locally. The reason the object is not garbage-collected seems to be that Jupyter's display machinery retains a reference; IPython appears to be the cause.
-- Searching for "Jupyter memory leak" turns up https://github.com/jupyter/notebook/issues/3713, which seems to contain an explanation (I have not read it).
-- Searching for "pandas memory leak" instead turns up https://stackoverflow.com/questions/14224068/memory-leak-using-pandas-dataframe and the like, and you will stay lost.