[PYTHON] pickle: reading data dumped in Python 2 from Python 3

I want to pickle.dump in Python 2 and pickle.load in Python 3.

The details are documented here: http://docs.python.jp/3.4/library/pickle.html

Shortly after moving to Python 3, I tried to read data that I had previously dumped with pickle in Python 2, and got an error.

Error

pickle-load-error.


# test_w_2.pkl is a file in which [1] was dumped with Python 2
In [25]: fin = open('test_w_2.pkl', 'r') 
In [26]: pickle.load(fin)    
TypeError: a bytes-like object is required, not 'str'       

The error says it received a str where bytes are required. It seems the file has to be read as bytes.

Solution

It seems you should pass 'rb' (binary mode) as the mode when calling `open`.

pickle-load.


In [38]: fin = open('test_wb_2.pkl', 'rb')
In [39]: pickle.load(fin)                                                                              
Out[39]: [1]   
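
The fix above can be reproduced as a small round trip. This is a minimal sketch for Python 3, not the original session: the file name `example.pkl` and the temporary directory are assumptions made for illustration, and protocol 0 is chosen only to mimic the text-like format that Python 2 writes by default.

```python
import os
import pickle
import tempfile

# Hypothetical path, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "example.pkl")

# Pickle data is a byte stream, so write it in binary mode ('wb').
# protocol=0 mimics the ASCII-only format Python 2 uses by default.
with open(path, "wb") as fout:
    pickle.dump([1], fout, protocol=0)

# Reading in binary mode ('rb') hands pickle.load the bytes it expects.
with open(path, "rb") as fin:
    loaded = pickle.load(fin)

# Reading in text mode hands pickle.load str objects, which raises TypeError.
try:
    with open(path, "r", encoding="ascii") as fin:
        pickle.load(fin)
    text_mode_error = None
except TypeError as exc:
    text_mode_error = exc

print(loaded)
print(type(text_mode_error).__name__)
```

Using `with` also closes the file handle, which the interactive one-liners in this post leave open.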

[Addition] When the dumped data contains strings

In the above example, only [1] was dumped. If the list contains a string (for example, non-ASCII text), loading may fail. In that case, pass `encoding='bytes'` to pickle.load, then convert each element of the resulting list to a string with `.decode('utf8')`.

python2.7.9

pickle.dump


>>> pickle.dump(['あいうえお'], open('test.pkl', 'w') )

python3.5.0

pickle.load


# Reading in text mode raises an error
>>> pickle.load(open('test.pkl', 'r') )                                                                
Traceback (most recent call last):                                                                     
  File "<stdin>", line 1, in <module>                                                                  
TypeError: a bytes-like object is required, not 'str'                                                  

# Even in binary mode it fails, because the bytes cannot be decoded with the default 'ascii' encoding
>>> pickle.load(open('test.pkl', 'rb') )                                                               
Traceback (most recent call last):                                                                     
  File "<stdin>", line 1, in <module>                                                                  
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe3 in position 0: ordinal not in range(128)      

# With encoding='bytes' it loads, but the string comes back as bytes
>>> pickle.load(open('test.pkl', 'rb'), encoding='bytes' )
[b'\xe3\x81\x82\xe3\x81\x84\xe3\x81\x86\xe3\x81\x88\xe3\x81\x8a']

# Use decode('utf8') to decode each element in the list
>>> list(map(lambda x: x.decode('utf8'), pickle.load(open('test.pkl', 'rb'), encoding='bytes' ) ) )
['あいうえお']
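
The decode step above only handles a flat list. Below is a minimal sketch of a recursive version, run on Python 3. Two things are assumptions rather than part of the original session: `PY2_PICKLE` is a hand-written byte string meant to mimic what Python 2 emits for a one-element list holding the UTF-8 text あいうえお (the bytes shown in the output above), and `decode_nested` is a hypothetical helper name.

```python
import pickle

# Assumption: hand-written protocol 0 bytes mimicking a Python 2 dump of a
# one-element list that holds the UTF-8 encoded str 'あいうえお'.
PY2_PICKLE = (
    b"(lp0\n"
    b"S'\\xe3\\x81\\x82\\xe3\\x81\\x84\\xe3\\x81\\x86\\xe3\\x81\\x88\\xe3\\x81\\x8a'\n"
    b"p1\n"
    b"a."
)


def decode_nested(obj, encoding="utf-8"):
    """Hypothetical helper: recursively turn bytes into str in common containers."""
    if isinstance(obj, bytes):
        return obj.decode(encoding)
    if isinstance(obj, list):
        return [decode_nested(item, encoding) for item in obj]
    if isinstance(obj, tuple):
        return tuple(decode_nested(item, encoding) for item in obj)
    if isinstance(obj, dict):
        return {
            decode_nested(key, encoding): decode_nested(value, encoding)
            for key, value in obj.items()
        }
    return obj


# Step 1: load the Python 2 strings as raw bytes.
raw = pickle.loads(PY2_PICKLE, encoding="bytes")

# Step 2: decode every bytes object found in the structure.
decoded = decode_nested(raw)

# Alternative: let pickle decode the Python 2 strings while loading.
direct = pickle.loads(PY2_PICKLE, encoding="utf-8")

print(raw)
print(decoded)
print(direct)
```

Passing `encoding='utf-8'` is shorter, but it presumably fails if some Python 2 str in the pickle holds binary data that is not valid UTF-8. In that case loading as bytes and decoding selectively is the safer route.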

[Addition] Conversely, reading data dumped in Python 3 from Python 2

A pickle dumped in Python 3 with the default settings cannot be loaded in Python 2. As described at the URL given at the beginning, you can specify the protocol when dumping. The protocol versions are described below.

- Protocol version 0 is the original "human-readable" protocol, backwards compatible with earlier versions of Python.
- Protocol version 1 is an older binary format, which is also compatible with earlier versions of Python.
- Protocol version 2 was introduced in Python 2.3. It provides much more efficient pickling of new-style classes. See PEP 307 for information on the improvements made in protocol 2.
- Protocol version 3 was added in Python 3.0. It supports bytes objects and cannot be unpickled by Python 2.x. This is the default protocol, and the recommended one when compatibility with other Python 3 versions is required.
- Protocol version 4 was added in Python 3.4. It supports very large objects, pickles more kinds of objects, and optimizes some data formats. See PEP 3154 for information on the improvements made in protocol 4.
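
To see which protocol a given pickle uses, you can look at its first bytes. This is a minimal sketch for Python 3, with `data` as an arbitrary sample value: binary protocols 2 and later start with the PROTO opcode (0x80) followed by the protocol number, while protocol 0 has no such header.

```python
import pickle

data = ["あいうえお"]  # arbitrary sample value

# Protocols 2 and later start with the PROTO opcode (0x80) and the version byte.
headers = {proto: pickle.dumps(data, protocol=proto)[:2] for proto in (2, 3, 4)}

# Protocol 0 is the text-like format and has no PROTO header.
text_pickle = pickle.dumps(data, protocol=0)

for proto, header in sorted(headers.items()):
    print(proto, header)
print(text_pickle[:1])

# The values of these constants depend on the Python version in use.
print(pickle.DEFAULT_PROTOCOL, pickle.HIGHEST_PROTOCOL)
```

The default noted in the list above reflects the Python 3.4 documentation linked at the beginning; checking `pickle.DEFAULT_PROTOCOL` on the interpreter you actually run avoids relying on that.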

Python 2 naturally only supports protocols up to 2, so when the data must also be readable from Python 2 it seems best to specify protocol 2 on the Python 3 side as well (although people around me rarely use Python 2). As far as I can tell, protocol 4 is the most capable, so it seems better to specify it when only you will use the data. Note that the default is 3.

python3.5

pickle.dump


>>> pickle.dump(['あいうえお'], open('test.pkl','wb'), protocol=2 )

python2.7

pickle.load


>>> pickle.load(open('test.pkl', 'rb') )
[u'\u3042\u3044\u3046\u3048\u304a']
>>> print(pickle.load(open('test.pkl', 'rb') )[0] )
あいうえお
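
The Python 3 side of this exchange can be wrapped in a small function. This is a minimal sketch under stated assumptions: `dump_for_python2` and `PY2_COMPATIBLE_PROTOCOL` are hypothetical names, and the temporary directory is used only so the example does not overwrite a real file. The read-back here runs on Python 3 and only checks the protocol header; it does not prove that Python 2 can load the file.

```python
import os
import pickle
import tempfile

# Highest protocol that Python 2 understands.
PY2_COMPATIBLE_PROTOCOL = 2


def dump_for_python2(obj, path):
    """Hypothetical helper: write obj so that Python 2 should be able to load it."""
    with open(path, "wb") as fout:
        # fix_imports=True (the default) maps Python 3 module names back to
        # their Python 2 names when the protocol is lower than 3.
        pickle.dump(obj, fout, protocol=PY2_COMPATIBLE_PROTOCOL, fix_imports=True)


# Hypothetical location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "test.pkl")
dump_for_python2(["あいうえお"], path)

with open(path, "rb") as fin:
    header = fin.read(2)  # PROTO opcode plus the protocol number
    fin.seek(0)
    restored = pickle.load(fin)

print(header)
print(restored)
```

Opening the file with 'rb' on the Python 2 side as well is the safer habit, since protocol 2 is a binary format.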

Summary

Adding 'rb' solves most of the problems (for now). There are other options such as `fix_imports` and `errors`, but so far things work without them. I will add more when I run into a case that does not work.
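
As a small illustration of the `errors` option mentioned above, here is a minimal sketch for Python 3. `PY2_PICKLE` is an assumption: a hand-written byte string meant to mimic a Python 2 protocol 0 dump of a list holding one three-byte UTF-8 string. `errors` is passed to the decoder together with `encoding`, so `errors='replace'` turns undecodable bytes into U+FFFD instead of raising UnicodeDecodeError.

```python
import pickle

# Assumption: hand-written protocol 0 bytes mimicking a Python 2 dump of a
# list holding the UTF-8 encoded str for a single character (three bytes).
PY2_PICKLE = b"(lp0\nS'\\xe3\\x81\\x82'\np1\na."

# With the default errors='strict', decoding as ASCII fails.
try:
    pickle.loads(PY2_PICKLE, encoding="ascii")
    strict_error = None
except UnicodeDecodeError as exc:
    strict_error = exc

# With errors='replace', each undecodable byte becomes U+FFFD.
replaced = pickle.loads(PY2_PICKLE, encoding="ascii", errors="replace")

print(type(strict_error).__name__)
print(replaced)
```

This loses the original text, so it is only useful for inspecting a file whose encoding is unknown.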
