[PYTHON] Filter the output of tracemalloc

Python ships with a standard module called tracemalloc for checking how much memory a program allocates and where.

With it you can, for example, check whether some piece of processing leaks memory by writing code like the following.

Code that calls tracemalloc directly


import tracemalloc

tracemalloc.start()
snap1 = tracemalloc.take_snapshot()
size1 = sum([stat.size for stat in snap1.statistics("filename")])
for stat in snap1.statistics('lineno')[:10]:
  print(stat)
# ...Some processing...
snap2 = tracemalloc.take_snapshot()
size2 = sum([stat.size for stat in snap2.statistics("filename")])
for stat in snap2.compare_to(snap1, 'lineno')[:10]:
  print(stat)
diff = abs(size1 - size2)
print("Memory difference after processing: {:,} bytes".format(diff))

Wrapped up for use with the with statement, it looks like this.

Turned into a class for use with the with statement


import tracemalloc

class MemCheck:
  """
  A class for performing memory checks. Use it with the with statement.
  The check runs automatically when the with block exits.
  """

  def __init__(self):
    """
Initialization
    """
    pass

  def __enter__(self):
    tracemalloc.start()
    self.snap = tracemalloc.take_snapshot()
    self.size = sum([stat.size for stat in self.snap.statistics("filename")])
    print("")
    print("-----TEST START!!!!-----")
    for stat in self.snap.statistics('lineno')[:10]:
      print(stat)
    return self

  def __exit__(self, ex_type, ex_value, trace):
    print("-----TEST END!!!!!!-----")
    snap = tracemalloc.take_snapshot()
    size = sum([stat.size for stat in snap.statistics("filename")])
    for stat in snap.compare_to(self.snap, 'lineno')[:10]:
      print(stat)
    diff = abs(self.size - size)
    print("Memory difference after processing: {:,} bytes".format(diff))
    print("-" * 20)
    return False

def main():
  with MemCheck():
    # ...Some processing...
    pass

However, when you actually run this, the objects reported as using memory include the internals of the tracemalloc module and objects that exist only to run the debugger, which makes it somewhat awkward to use for memory-leak testing.

For this purpose, snapshots have a method called filter_traces() for filtering the data.

However, the documentation gives only two examples, so it does not cover cases such as "I want to completely ignore objects that exist only for debugging" or, conversely, "I want to test only the code I wrote".
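For reference, the exclusion-style filtering can be sketched as follows. The first two patterns are the ones shown in the documentation; the third filter, which hides tracemalloc's own file, is an addition made here for illustration, as are the variable names.

```python
import tracemalloc

tracemalloc.start()
data = [bytes(1000) for _ in range(100)]  # some allocations to observe
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

# Exclusive filters (first argument False) drop traces whose file name matches.
filtered = snapshot.filter_traces((
    tracemalloc.Filter(False, "<frozen importlib._bootstrap>"),
    tracemalloc.Filter(False, "<unknown>"),
    # Assumption for this sketch: also hide tracemalloc's own bookkeeping.
    tracemalloc.Filter(False, tracemalloc.__file__),
))

raw_count = len(snapshot.traces)
filtered_count = len(filtered.traces)
print(raw_count, filtered_count)
```

Filters like these only remove known noise one pattern at a time, which is why they do not help much when the goal is to keep just your own code.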

So what do you do when you want to specify a folder and say "trace only the files under this folder"?

Do this

You can write it as follows.

An example that uses the with syntax with a filter added


import tracemalloc
from pathlib import Path

class MemCheck:
  """
  A class for performing memory checks. Use it with the with statement.
  The check runs automatically when the with block exits.
  """

  def __init__(self):
    """
Initialization
    """
    pass

  def __enter__(self):
    tracemalloc.start()
    self.snap = tracemalloc.take_snapshot().filter_traces(self.get_filter_traces())
    self.size = sum([stat.size for stat in self.snap.statistics("filename")])
    print("")
    print("-----TEST START!!!!-----")
    for stat in self.snap.statistics('lineno')[:10]:
      print(stat)
    return self

  def __exit__(self, ex_type, ex_value, trace):
    print("-----TEST END!!!!!!-----")
    snap = tracemalloc.take_snapshot().filter_traces(self.get_filter_traces())
    size = sum([stat.size for stat in snap.statistics("filename")])
    for stat in snap.compare_to(self.snap, 'lineno')[:10]:
      print(stat)
    diff = abs(self.size - size)
    print("Memory difference after processing: {:,} bytes".format(diff))
    print("-" * 20)
    return False

  # Added: the filters applied to every snapshot
  def get_filter_traces(self):
    return (
      tracemalloc.Filter(True, str(Path(__file__).parent.parent / ".venv" / "lib" / "site-packages" / "*")),
      tracemalloc.Filter(True, str(Path(__file__).parent.parent / "src" / "*")),
    )

def main():
  with MemCheck():
    # ...Some processing...
    pass

The get_filter_traces() method in the code returns the list of filters passed to filter_traces(). It is a separate method because the same filters must be applied every time a snapshot is taken.

filter_traces() takes a tuple (or list) of tracemalloc.Filter objects. The first constructor argument of tracemalloc.Filter, inclusive, selects between showing only what matches the filter (True) and showing only what does not match it (False). The second argument, filename_pattern, is the file name pattern the filter applies.
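The difference between True and False in the first argument can be seen in a small sketch like the one below. It assumes the allocations of interest are made by the code being run, and uses inspect to obtain the file name tracemalloc records for that code; the variable names are illustrative only.

```python
import inspect
import tracemalloc

# File name that tracemalloc records for allocations made by this code.
this_file = inspect.currentframe().f_code.co_filename

tracemalloc.start()
data = [bytes(1000) for _ in range(100)]  # allocations made by this code
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

# inclusive=True: keep only traces whose file name matches the pattern.
only_mine = snapshot.filter_traces((tracemalloc.Filter(True, this_file),))
# inclusive=False: keep only traces whose file name does NOT match.
without_mine = snapshot.filter_traces((tracemalloc.Filter(False, this_file),))

print(len(snapshot.traces), len(only_mine.traces), len(without_mine.traces))
```

With a single pattern, the two filtered snapshots split the original traces between them. When several inclusive filters are passed together, as in get_filter_traces(), a trace is kept if it matches at least one of them.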

The documentation describes filename_pattern only as "Filename pattern of the filter (str). Read-only property.", so at first glance it looks as though a regular expression would be accepted. In fact, it takes a shell-style wildcard string (one that the fnmatch module can process).

If you are used to pattern matching with regular expressions, you may assume that "pattern" means "regular expression", but note that it does not. I wish the description had said at least that much ...
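The difference can be checked directly with the fnmatch module. The path below is a made-up example, not one from a real project.

```python
import fnmatch
import re

path = "/project/src/pkg/module.py"  # hypothetical file name

# Shell-style wildcard: "*" matches any run of characters,
# including the separators of nested folders.
wildcard_hit = fnmatch.fnmatch(path, "/project/src/*")

# Regex-style ".*" used as a wildcard pattern: the "." is a literal dot,
# so this only matches names that start with "/project/src/.".
regex_style_hit = fnmatch.fnmatch(path, "/project/src/.*")

# The same string does match when it is really used as a regular expression.
regex_hit = re.match("/project/src/.*", path) is not None

print(wildcard_hit, regex_style_hit, regex_hit)

# fnmatch.translate() shows the regular expression a wildcard pattern becomes.
print(fnmatch.translate("/project/src/*"))
```

Because "*" also crosses folder boundaries, a single pattern ending in "*" covers everything beneath a folder.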

Tracing only the files under the src folder may show nothing

If the memory was allocated not by the program you wrote but by an object that your program called, what gets reported is not the line number of your own source file (the one that calls tracemalloc.take_snapshot()) but the line number of the source file that defines the called object. (I called PyPDF2.PdfFileReader from my program, and the memory it read was not released. In that case the output said that the memory was allocated in files under \lib\site-packages\PyPDF2\.)

Therefore, when creating the filters, include not only "under the src folder" but also "the folder of the module being called" in the trace targets, as in the example above. If the virtual environment folder is created inside the project, targeting something like .venv\lib\site-packages\* as shown above is enough.
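The effect can be reproduced without PyPDF2. In the sketch below the standard json package stands in for the third-party library (an assumption made purely for illustration): the decoded objects are created while code inside the json package is running, so with the default settings tracemalloc attributes them to files in that package rather than to the calling line.

```python
import json
import os
import tracemalloc

tracemalloc.start()  # default: only the most recent frame is recorded
text = "[" + ",".join('"item-%d"' % i for i in range(10000)) + "]"
payload = json.loads(text)  # called from "our" code
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

# Keep only allocations attributed to files inside the json package.
json_dir = os.path.dirname(json.__file__)
in_json = snapshot.filter_traces(
    (tracemalloc.Filter(True, os.path.join(json_dir, "*")),)
)
json_bytes = sum(stat.size for stat in in_json.statistics("filename"))

print("bytes attributed to the json package: {:,}".format(json_bytes))
for stat in in_json.statistics("lineno")[:3]:
    print(stat)
```

A filter that only covered the caller's folder would have dropped these allocations, which is the "nothing appears" symptom described above.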
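As a possible alternative to listing library folders, tracemalloc.Filter also accepts an all_frames argument. When tracemalloc is started with more than one frame per allocation, a filter with all_frames=True matches if any stored frame of the traceback matches, not just the most recent one. The sketch below is not the approach used in this article; it again uses json as a stand-in library and assumes the calling code is the file being run.

```python
import inspect
import json
import tracemalloc

# File name that tracemalloc records for frames of this code.
this_file = inspect.currentframe().f_code.co_filename

tracemalloc.start(10)  # store up to 10 frames per allocation
text = "[" + ",".join('"item-%d"' % i for i in range(10000)) + "]"
payload = json.loads(text)
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

# Keep every allocation that has this code anywhere in its stored traceback,
# even when the most recent frame belongs to a library.
own_code = tracemalloc.Filter(True, this_file, all_frames=True)
kept = snapshot.filter_traces((own_code,))

# Grouping by file name still reports the most recent frame's file.
files = {stat.traceback[0].filename for stat in kept.statistics("filename")}
for name in sorted(files):
    print(name)
```

Storing more frames makes tracing slower and more memory-hungry, and all_frames has no effect when only one frame is stored, so whether this is preferable to extra folder filters depends on the program.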

How I noticed

I opened C:\PythonNN\Lib\tracemalloc.py and traced through what Filter does (deadpan).

I really wish the behavior could be understood without having to go that far ...
