Novak

Reputation: 4783

Python 3 pickle load from Python 2

I have a pickle file that was created (I don't know exactly how) in Python 2. It is meant to be loaded by the following Python 2 lines, which (unsurprisingly) do not work when run in Python 3:

with open('filename','r') as f:
    foo, bar = pickle.load(f)

Result:

'ascii' codec can't decode byte 0xc2 in position 1219: ordinal not in range(128)

Manual inspection of the file indicates it is utf-8 encoded, therefore:

with open('filename','r', encoding='utf-8') as f:
    foo, bar = pickle.load(f)

Result:

TypeError: a bytes-like object is required, not 'str'

With binary mode and an encoding argument:

with open('filename','rb', encoding='utf-8') as f:
    foo, bar = pickle.load(f)

Result:

ValueError: binary mode doesn't take an encoding argument

With binary mode and no encoding argument:

with open('filename','rb') as f:
    foo, bar = pickle.load(f)

Result:

UnpicklingError: invalid load key, ' '.

Is this pickle file just broken? If not, how can I pry this thing open in python 3? (I have browsed the extensive collection of related questions and not found anything that works yet.)

Finally, note that the original

import cPickle as pickle

has been replaced with

import _pickle as pickle

Upvotes: 6

Views: 9134

Answers (2)

Salvatore Cosentino

Reputation: 7230

Loading Python 2 pickles in Python 3 (version 3.7.2 in this example) can be helped by the fix_imports parameter of the pickle.load function, although in my case it also worked without explicitly setting that parameter to True.

I was attempting to load a scipy.sparse.csr.csr_matrix contained in a pickle generated with Python 2.

Inspecting the file format with the UNIX file command gives:

>file -bi python2_generated.pckl
application/octet-stream; charset=binary

I could load the pickle in Python3 using the following code:

import pickle

with open("python2_generated.pckl", "rb") as fd:
    bh01 = pickle.load(fd, fix_imports=True, encoding="latin1")

Note that the loading was successful both with and without explicitly setting fix_imports to True, which is expected because True is its default value.

As for the "latin1" encoding, the Python 3 documentation (version 3.7.2) for the pickle.load function says: "Using encoding='latin1' is required for unpickling NumPy arrays and instances of datetime, date and time pickled by Python 2."
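For repeated use, the call above can be wrapped in a small function. This is only a sketch: load_py2_pickle is a hypothetical name, not part of the pickle module, and it assumes the file arrived intact and comes from a source you trust, since unpickling can execute arbitrary code.

```python
import pickle


def load_py2_pickle(path, encoding="latin1"):
    """Load a pickle written by Python 2 from Python 3.

    Hypothetical helper for illustration; not part of the pickle module.
    Only use it on files from a trusted source.
    """
    # Pickles are binary data, so the file is opened in "rb" mode and
    # open() gets no encoding argument. The encoding is passed to
    # pickle.load instead, where it controls how Python 2 str objects
    # are decoded into Python 3 str objects.
    with open(path, "rb") as fd:
        return pickle.load(fd, fix_imports=True, encoding=encoding)
```

Calling load_py2_pickle("python2_generated.pckl") would then be equivalent to the two lines shown earlier. Passing encoding="bytes" instead leaves Python 2 strings as bytes objects, which can be handy when the right text encoding is unknown.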

Although this applies specifically to SciPy matrices (or NumPy arrays), and Novak has not clarified what his pickle file contained, I hope this can be of help to other users :)

Upvotes: 3

Novak

Reputation: 4783

Two errors were compounding each other.

First: By the time the .p file reached me, it had almost certainly been corrupted in transit, likely by FTP-ing (or similar) in ASCII rather than binary mode. I was able to get my hands on a properly transmitted copy, which allowed me to discover...

Second: Whatever the file might have implied on the inside, the proper encoding was 'latin1' not 'utf-8'.
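To illustrate the encoding point, here is a small sketch using a hand-written pickle rather than my actual file. The bytes in PY2_STR_PICKLE are an assumption made for the demo: they are what a protocol 2 pickle of the Python 2 byte string 'caf\xc3\xa9' (UTF-8 for "café") looks like. The helper name try_load is likewise made up.

```python
import pickle

# Hand-written protocol 2 pickle of a Python 2 str holding UTF-8 bytes:
# PROTO 2, SHORT_BINSTRING of length 5, BINPUT 0, STOP.
PY2_STR_PICKLE = b"\x80\x02U\x05caf\xc3\xa9q\x00."


def try_load(data, **kwargs):
    """Return the unpickled object, or the decode error if one is raised."""
    try:
        return pickle.loads(data, **kwargs)
    except UnicodeDecodeError as exc:
        return exc


# Default encoding is ASCII, which rejects any byte above 0x7F.
default_result = try_load(PY2_STR_PICKLE)

# latin1 maps every byte to exactly one code point, so decoding cannot fail.
latin1_result = try_load(PY2_STR_PICKLE, encoding="latin1")

# "bytes" skips decoding and returns the raw Python 2 string as bytes.
bytes_result = try_load(PY2_STR_PICKLE, encoding="bytes")

# If the text really was UTF-8, it can be recovered after the fact.
recovered_text = latin1_result.encode("latin1").decode("utf-8")
```

Because latin1 is lossless at the byte level, it is a safe first choice even when the strings inside turn out to be UTF-8: the original bytes can always be recovered with .encode("latin1") and decoded again.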

So in a sense, yes, the file was broken, and even after that I was doing it wrong. I leave this here as a reminder to whoever eventually has the next bizarre pickle/python2/python3 issue that multiple things can have gone wrong, and they have to be solved in the correct order.
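Since a damaged file has to be ruled out before any encoding experiments make sense, a rough integrity check can help. The sketch below is a heuristic, not a proof of integrity: it walks the opcode stream with pickletools.genops without executing anything, and the function name opcodes_parse_cleanly is made up. The "mangled" data is a simulation of an LF to CRLF rewrite, which is my guess at what an ASCII-mode transfer did, not a reproduction of the actual file.

```python
import pickle
import pickletools


def opcodes_parse_cleanly(data):
    """Return True if the pickle opcode stream can be walked up to STOP.

    Heuristic only: the opcodes are parsed but never executed, so a True
    result does not prove the payload is what the sender wrote.
    """
    try:
        for _ in pickletools.genops(data):
            pass
    except ValueError:
        # genops raises ValueError for unknown opcodes, truncated
        # arguments, or a stream that ends before STOP.
        return False
    return True


# Simulated example: a 10-character string gives a 4-byte length field
# whose first byte is 0x0A, i.e. a line feed.
good = pickle.dumps("a" * 10, protocol=2)

# Mimic a transfer that rewrites every LF as CRLF.
mangled = good.replace(b"\n", b"\r\n")

good_ok = opcodes_parse_cleanly(good)
mangled_ok = opcodes_parse_cleanly(mangled)
```

Here the rewritten length field claims far more data than the stream contains, so the walk fails on the mangled copy. For a file on disk, running python -m pickletools on it gives a full disassembly, which shows where parsing stops.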

Upvotes: 2
