G G

Reputation: 33

Why can't Python 3 pickle read Python 2 pickle data?

I have a pickle file written by Python 2. When I try to read it with Python 3, I get the following error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)

Here are code samples for Python 2 and Python 3:

python_2_dump.py

# -*- coding: utf-8 -*-
# Python 2 version
import cPickle

test = {
  'Á': 'A',
  'á': 'a',
  'Ã': 'A',
  'ã': 'a',
  'Â': 'A',
  'â': 'a',
}

with open('test.pickle', 'wb') as f:  # binary mode keeps the pickle bytes intact on every platform
  cPickle.dump(test, f)

python_3_load.py

# Python 3 version
import pickle

with open('test.pickle', 'rb') as f:
  print(pickle.load(f))

Is there any reason Python 3 doesn't detect the old protocol and convert the data accordingly? If it were the other way around, i.e. Python 2 failing to read Python 3 pickle data, it would make sense.

Upvotes: 2

Views: 523

Answers (1)

buran

Reputation: 14233

The protocol is detected automatically, as stated in the docs:

The protocol version of the pickle is detected automatically, so no protocol argument is needed.

However, the protocol version is not the problem here. To read pickle streams generated by Python 2, you need the fix_imports, encoding and errors arguments, which control compatibility support. The relevant docs:

The optional arguments fix_imports, encoding and errors are used to control compatibility support for pickle stream generated by Python 2. If fix_imports is true, pickle will try to map the old Python 2 names to the new names used in Python 3. The encoding and errors tell pickle how to decode 8-bit string instances pickled by Python 2; these default to ‘ASCII’ and ‘strict’, respectively. The encoding can be ‘bytes’ to read these 8-bit string instances as bytes objects. Using encoding='latin1' is required for unpickling NumPy arrays and instances of datetime, date and time pickled by Python 2.

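In your example, the Python 2 source file is declared as UTF-8, so the dictionary keys were pickled as 8-bit strings holding UTF-8 bytes (0xc3 is the first byte of 'Á'). The default 'ASCII' encoding cannot decode them, but test.pickle loads correctly if you pass encoding='utf-8':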

print(pickle.load(f, encoding='utf-8'))

output:

{'Ã': 'A', 'â': 'a', 'Á': 'A', 'ã': 'a', 'Â': 'A', 'á': 'a'}
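
If you want to see the effect of the encoding argument without a Python 2 interpreter at hand, here is a minimal sketch. The byte string PY2_STREAM is hand-written to approximate the protocol 0 stream Python 2 would produce for a one-entry version of the dictionary, and try_load is a hypothetical helper, not part of the pickle module:

```python
import pickle

# Assumption: a hand-written protocol-0 stream approximating what Python 2
# emits for {'\xc3\x81': 'A'}, i.e. the UTF-8 bytes of 'Á' stored as an 8-bit str.
PY2_STREAM = b"(dp1\nS'\\xc3\\x81'\np2\nS'A'\np3\ns."


def try_load(encoding):
    """Return the unpickled object, or the error class name if decoding fails."""
    try:
        return pickle.loads(PY2_STREAM, encoding=encoding)
    except UnicodeDecodeError as exc:
        return type(exc).__name__


default_result = try_load("ASCII")  # the default encoding: fails on byte 0xc3
utf8_result = try_load("utf-8")     # decodes the key back to the text 'Á'
bytes_result = try_load("bytes")    # leaves 8-bit strings as bytes objects

for name, result in [("ASCII", default_result),
                     ("utf-8", utf8_result),
                     ("bytes", bytes_result)]:
    # ascii() keeps the output printable even on a non-UTF-8 console
    print(name, ascii(result))
```

Passing encoding='bytes' is a reasonable fallback when you do not know which encoding the Python 2 strings used, since you can then decode each value yourself.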

Upvotes: 3
