JoaoAlby

Reputation: 157

Python 2 vs Python 3 string pickling

I get different output when pickling the same string in Python 2 and Python 3 (due to the different str types, I suppose).

Python 2:

Python 2.7.12 (default, Dec  4 2017, 14:50:18) 
[GCC 5.4.0 20160609]
>>> import pickle, base64
>>> a = pickle.dumps('test')
>>> base64.b64encode(a)
'Uyd0ZXN0JwpwMAou'

Python 3:

Python 3.5.2 (default, Nov 23 2017, 16:37:01) 
[GCC 5.4.0 20160609]
>>> import pickle, base64
>>> a = pickle.dumps('test')
>>> base64.b64encode(a)
b'gANYBAAAAHRlc3RxAC4='

How can I modify the code to get the same results when pickling a string?

EDIT:

With protocol=2 I still get different pickles:

# Python 2
>>> base64.b64encode(pickle.dumps('test', protocol=2))
'gAJVBHRlc3RxAC4='

# Python 3
>>> base64.b64encode(pickle.dumps('test', protocol=2))
b'gAJYBAAAAHRlc3RxAC4='

Upvotes: 1

Views: 2022

Answers (1)

9000

Reputation: 40894

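pickle supports several protocol versions, and the default differs between Python 2 (protocol 0) and Python 3 (protocol 3 in the 3.5 release shown in the question).

Pass the protocol version explicitly: pickle.dumps('test', protocol=2) produces a pickle that both Python 2 and Python 3 can load. Protocol 2 is the highest version Python 2 understands.

Note: the exact bytes may still differ, because Python 2's str is a byte string while Python 3's str is Unicode text, but the unpickled result is the same, modulo unicode vs. str in Python 2: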

# Python 2.7 output:
>>> base64.b64encode(pickle.dumps('test', protocol=2))
'gAJVBHRlc3RxAC4='
# Decode output from Python 3:
>>> pickle.loads(base64.b64decode('gAJYBAAAAHRlc3RxAC4='))
u'test'

# Python 3.6 output:
>>> base64.b64encode(pickle.dumps('test', protocol=2))
b'gAJYBAAAAHRlc3RxAC4='
# Decoding Python 2's output:
>>> pickle.loads(base64.b64decode('gAJVBHRlc3RxAC4='))
'test'  # Note, not u'test'.
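
If the goal is for both interpreters to pickle the same kind of object, one option is to always pickle Unicode text, since that is what Python 3's str already is. The following is a minimal sketch, not a guaranteed fix: dump_text is a hypothetical helper name, and the ASCII decoding step is an assumption about the input. With the interpreters from the question this should make Python 2 emit the same Unicode opcode as Python 3, but pickle does not promise byte-for-byte identical output across versions, so compare the unpickled values rather than the raw bytes where possible.

```python
import base64
import pickle


def dump_text(value):
    """Pickle a text string with protocol 2, which Python 2 and 3 can both load.

    Hypothetical helper for illustration only.
    """
    # In Python 2, a plain 'test' literal is a byte string (bytes is an alias
    # of str there). Decode it first so that a unicode object is pickled,
    # which mirrors what Python 3 does with its str type.
    if isinstance(value, bytes):
        value = value.decode('ascii')  # assumption: input is ASCII-only
    return pickle.dumps(value, protocol=2)


# The u'' prefix is valid in Python 2 and in Python 3.3+.
encoded = base64.b64encode(dump_text(u'test'))
restored = pickle.loads(base64.b64decode(encoded))
```

Here restored compares equal to u'test' on either interpreter, which is usually the property worth checking.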

Upvotes: 2
