About Python pickle (cPickle) and marshal

Introduction

Python has an object serialization module called pickle (cPickle is its C implementation in Python 2). pickle is well known. There is also a similar module called marshal, which is much less well known. marshal is not meant to be a general persistence module, and its format is not guaranteed to be compatible between Python versions, so there is rarely a reason to use it. Unsurprisingly, it is not widely used.

marshal has many drawbacks compared to pickle, but I found that it also has a good point, so this time I will focus on that. The good point is speed.
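
To make those limitations concrete, here is a minimal sketch. It assumes Python 3, where cPickle has been merged into pickle, and it uses a datetime.date value purely as an example of an object that is not one of marshal's supported built-in types. The helper try_marshal is a name made up for this illustration.

```python
import datetime
import marshal
import pickle


def try_marshal(obj):
    """Return True if marshal can serialize obj, False otherwise."""
    try:
        marshal.dumps(obj)
        return True
    except ValueError:
        # marshal raises ValueError for types it does not support
        return False


when = datetime.date(2020, 1, 1)

# pickle round-trips the date object
restored = pickle.loads(pickle.dumps(when))
print(restored == when)

# marshal handles built-in containers of simple values, but not the date
print(try_marshal({'x': 1, 'y': [2.0, 3.0]}))
print(try_marshal(when))

# The marshal format version used by this interpreter
print(marshal.version)
```

The format version printed at the end can differ between Python releases, which is why marshal files are not a safe way to exchange data between versions.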

Check speed

Let's check the write and read speeds right away. The comparison is between cPickle and marshal.

First, writing.

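#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Python 2 script: cPickle and the print statement do not exist in Python 3.
import marshal
import cPickle as pickle
import time
import numpy as np


def main():
    # Uninitialized 10000 x 10000 array (float64 by default, about 800 MB)
    a = np.ndarray(shape=(10000, 10000))

    # Time the write with cPickle (default protocol)
    start = time.time()
    with open('output.pkl', 'wb') as f:
        pickle.dump(a, f)
    p_time = time.time() - start

    # Time the write with marshal
    start = time.time()
    with open('output.msl', 'wb') as f:
        marshal.dump(a, f)
    m_time = time.time() - start

    print p_time, m_time

if __name__ == '__main__':
    main()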

The script generates a two-dimensional array and writes it out with each module.

Let's take a look at the output.

143.123441935 5.09839010239

The left value is cPickle, the right value is marshal, and the unit is seconds. The difference was much larger than I expected.
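
For readers who want to try a similar measurement without NumPy or Python 2, here is a rough sketch. It assumes Python 3 and makes several choices that differ from the script above: it uses a plain list of floats, serializes in memory with dumps/loads instead of writing files, and asks pickle for its highest protocol. The helper time_call is a hypothetical name. The numbers it prints depend on the machine and are not comparable to the results in this article.

```python
import marshal
import pickle
import time


def time_call(func, *args):
    """Run func(*args) once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


# Much smaller than the 10000 x 10000 array, so it finishes quickly
data = [float(i) for i in range(200000)]

# Serialize with each module
p_bytes, p_dump = time_call(pickle.dumps, data, pickle.HIGHEST_PROTOCOL)
m_bytes, m_dump = time_call(marshal.dumps, data)

# Deserialize with each module
p_data, p_load = time_call(pickle.loads, p_bytes)
m_data, m_load = time_call(marshal.loads, m_bytes)

print('dump: pickle %.4f s, marshal %.4f s' % (p_dump, m_dump))
print('load: pickle %.4f s, marshal %.4f s' % (p_load, m_load))
```

A single run like this is noisy, so repeating it several times (for example with the timeit module) would give a fairer picture.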

Next is reading.

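#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Python 2 script: cPickle and the print statement do not exist in Python 3.
import marshal
import cPickle as pickle
import time
import numpy as np  # not called directly; cPickle needs NumPy to rebuild the array


def main():
    # Time the read with cPickle
    start = time.time()
    with open('output.pkl', 'rb') as f:
        a = pickle.load(f)
    p_time = time.time() - start

    # Time the read with marshal
    start = time.time()
    with open('output.msl', 'rb') as f:
        b = marshal.load(f)
    m_time = time.time() - start

    print p_time, m_time

if __name__ == '__main__':
    main()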

The code is the same as before, with dump replaced by load.

Here are the results.

445.698551893 1.64994597435

As before, the left value is cPickle, the right value is marshal, and the unit is seconds. That is a huge difference. I'm quite surprised.

Summary

So it turns out that marshal is faster. I haven't investigated why.
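
One thing worth checking before reading too much into the numbers is whether both modules return the same kind of object. The sketch below is only a hint, not a measured explanation. It assumes Python 3 and uses array.array from the standard library as a stand-in for a NumPy array. marshal does not know the array type; it writes objects that expose a buffer as raw bytes, so what comes back is a bytes object rather than the original array. I have not verified here that the NumPy array in the scripts above behaves the same way under Python 2.

```python
import array
import marshal
import pickle

values = array.array('d', [1.0, 2.0, 3.0])

# pickle records the type and rebuilds an equal array
via_pickle = pickle.loads(pickle.dumps(values))

# marshal stores only the underlying buffer and returns plain bytes
via_marshal = marshal.loads(marshal.dumps(values))

print(type(via_pickle).__name__)
print(type(via_marshal).__name__)

# Rebuilding the array from the raw bytes is left to the caller,
# who has to remember the element type ('d' here)
rebuilt = array.array('d')
rebuilt.frombytes(via_marshal)
print(rebuilt == values)
```

If the same thing happens with a NumPy array, marshal is doing less work than pickle, because the shape and dtype are not saved and have to be supplied again when loading.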

However, even though marshal wins on speed, pickle is far more convenient because it is designed for persistence, so pickle is still the one to use. Then why compare them? Only because I was curious.

So this time I put the spotlight on marshal, which I would rarely use. That's all.
