jss367

Reputation: 5381

TypeError: a bytes-like object is required, not 'str' when opening Python 2 Pickle file in Python 3

I am trying to open a pickle file in Python 3 with code that worked in Python 2 but is now giving me an error. Here is the code:

with open(file, 'r') as f:
    d = pickle.load(f)

TypeError                                 Traceback (most recent call last)
<ipython-input-25-38f711abef06> in <module>()
      1 with open(file, 'r') as f:
----> 2     d = pickle.load(f)

TypeError: a bytes-like object is required, not 'str'

I saw in other SO answers that people hit this problem when using open(file, 'rb') and that switching to open(file, 'r') fixed it. In case it helps, I also tried open(file, 'rb') as an experiment and got the following error:

UnpicklingError                           Traceback (most recent call last)
<ipython-input-26-b77842748a06> in <module>()
      1 with open(file, 'rb') as f:
----> 2     d = pickle.load(f)

UnpicklingError: invalid load key, '\x0a'.

When I open the file with f = open(file, 'r') and then enter f, I get:

<_io.TextIOWrapper name='D:/LargeDataSets/Enron/final_project_dataset.pkl' mode='r' encoding='cp1252'>

So I also tried:

with open(file, 'rb') as f:
    d = pickle.load(f, encoding='cp1252')

and got the same error as with 'rb' alone:

UnpicklingError                           Traceback (most recent call last)
<ipython-input-27-959b1b0496d0> in <module>()
      1 with open(file, 'rb') as f:
----> 2     d = pickle.load(f, encoding='cp1252')

UnpicklingError: invalid load key, '\x0a'.

Upvotes: 8

Views: 14996

Answers (3)

Sreeragh A R

Reputation: 3021

Here is how loading with encoding='bytes' works.

Assume you have a dictionary that was pickled in Python 2:

import pickle

# value1 and value2 stand for whatever objects you want to store
data_dict = {'key1': value1, 'key2': value2}
with open('pickledObj.pkl', 'wb') as outfile:
    pickle.dump(data_dict, outfile)

Unpickling in Python 3:

with open('pickledObj.pkl', 'rb') as f:
    data_dict = pickle.load(f, encoding='bytes')

Note: the dictionary keys are no longer strings; they are bytes.

data_dict['key1']  # raises KeyError

data_dict[b'key1']  # gives value1

or use

data_dict['key1'.encode('utf-8')]  # gives value1
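To see this behaviour without a Python 2 interpreter, here is a minimal sketch. The byte string PY2_PICKLE is a hand-written protocol 0 pickle of {'key1': 1}; it only stands in for a file written by Python 2 and is not the asker's data. It also shows that encoding='latin1' decodes the old str objects to str instead of leaving them as bytes.

```python
import pickle

# Hand-written protocol 0 pickle of {'key1': 1}, standing in for a file
# written by Python 2 (an assumption made for illustration only).
PY2_PICKLE = b"(dp0\nS'key1'\np1\nI1\ns."

# encoding='bytes' leaves Python 2 str objects as bytes.
as_bytes = pickle.loads(PY2_PICKLE, encoding='bytes')

# encoding='latin1' decodes them to str instead.
as_text = pickle.loads(PY2_PICKLE, encoding='latin1')

print(as_bytes)
print(as_text)
```

Which option fits depends on whether the old strings were really text or raw binary data.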

Upvotes: 15

jss367

Reputation: 5381

After digging through the raw file in Sublime, it looks like the file was not correctly pickled. The above code works perfectly on a different version of that file.
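If you would rather check the raw bytes from Python than from an editor, something like the sketch below may help. The helper name pickle_head_kind is made up for this example, and the check is only a rough heuristic: pickles written with protocol 2 or higher start with the byte 0x80 followed by the protocol number, while a leading newline or space is not a valid pickle opcode.

```python
import pickle

def pickle_head_kind(data):
    """Roughly classify the first bytes of a file (heuristic only)."""
    if not data:
        return 'empty'
    if data[0] == 0x80 and len(data) > 1:
        # Protocol 2+ pickles begin with 0x80 and the protocol number.
        return 'binary (protocol %d)' % data[1]
    if data[:1].isspace():
        # Whitespace is not a valid opcode at the start of a pickle.
        return 'starts with whitespace'
    return 'text or unknown'

# Usage on a real file (the path here is hypothetical):
# with open('some_file.pkl', 'rb') as f:
#     print(pickle_head_kind(f.read(16)))

print(pickle_head_kind(pickle.dumps({'a': 1}, protocol=2)))
print(pickle_head_kind(b'\n(dp0\n'))
```

This does not prove a file is a valid pickle; it only gives a quick hint about what is at the start of it.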

Upvotes: -1

metakermit

Reputation: 22301

Yeah, there are some changes between the Python 2 and 3 pickle formats. If possible, I'd recommend creating the pickled data again using Python 3.

If that's not possible or easy, try different encoding settings (did you try 'utf8'?), or read the data in with encoding='bytes' as mentioned here and then decode the strings in your own code, where you can inspect the object further.
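For the "decode the strings in your code" route, a rough sketch could look like this. The helper decode_bytes is a made-up name, it only walks dicts, lists and tuples, and it assumes the bytes really are UTF-8 text; adjust the encoding if yours differ.

```python
import pickle

def decode_bytes(obj, encoding='utf-8'):
    """Recursively turn bytes into str inside dicts, lists and tuples."""
    if isinstance(obj, bytes):
        return obj.decode(encoding)
    if isinstance(obj, dict):
        return {decode_bytes(k, encoding): decode_bytes(v, encoding)
                for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return type(obj)(decode_bytes(x, encoding) for x in obj)
    return obj

# Example object shaped like the result of pickle.load(f, encoding='bytes').
raw = {b'key1': [b'a', 1], b'key2': (b'b',)}
cleaned = decode_bytes(raw)
print(cleaned)

# Once cleaned, the data can be pickled again from Python 3.
repickled = pickle.dumps(cleaned)
```

After that you can write the re-pickled data to a new file and load it normally from Python 3.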

Upvotes: 2
