Reputation: 5381
I am trying to open a pickle file in Python 3 with code that worked in Python 2 but is now giving me an error. Here is the code:
with open(file, 'r') as f:
    d = pickle.load(f)
TypeError Traceback (most recent call last)
<ipython-input-25-38f711abef06> in <module>()
1 with open(file, 'r') as f:
----> 2 d = pickle.load(f)
TypeError: a bytes-like object is required, not 'str'
I saw in other SO answers that people had this problem when using open(file, 'rb') and that switching to open(file, 'r') fixed it. In case it helps, I tried open(file, 'rb') just to experiment and got the following error:
UnpicklingError Traceback (most recent call last)
<ipython-input-26-b77842748a06> in <module>()
1 with open(file, 'rb') as f:
----> 2 d = pickle.load(f)
UnpicklingError: invalid load key, '\x0a'.
When I open the file with f = open(file, 'r') and then enter f, I get:
<_io.TextIOWrapper name='D:/LargeDataSets/Enron/final_project_dataset.pkl' mode='r' encoding='cp1252'>
So I also tried:
with open(file, 'rb') as f:
    d = pickle.load(f, encoding='cp1252')
and got the same error as with using 'rb':
UnpicklingError Traceback (most recent call last)
<ipython-input-27-959b1b0496d0> in <module>()
1 with open(file, 'rb') as f:
----> 2 d = pickle.load(f, encoding='cp1252')
UnpicklingError: invalid load key, '\x0a'.
Upvotes: 8
Views: 14996
Reputation: 3021
Here is an explanation of loading with encoding='bytes'.
Assume you have a dictionary that was pickled in Python 2:
import pickle

data_dict = {'key1': value1, 'key2': value2}
with open('pickledObj.pkl', 'wb') as outfile:
    pickle.dump(data_dict, outfile)
Unpickling in Python 3:
with open('pickledObj.pkl', 'rb') as f:
    data_dict = pickle.load(f, encoding='bytes')
Note: the keys of the dictionary are no longer str objects. They are bytes, so you have to look them up with bytes keys:
data_dict['key1']   # raises KeyError
data_dict[b'key1']  # gives value1
or use
data_dict['key1'.encode('utf-8')]  # gives value1
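If looking everything up with bytes keys gets tedious, one option is to convert the loaded object back to str once, right after unpickling. The following is only a minimal sketch: decode_bytes is a hypothetical helper (not part of pickle), it only walks dicts, lists and tuples, and it assumes every bytes value really is UTF-8 text, which may not hold for your data.

```python
def decode_bytes(obj, encoding="utf-8"):
    """Recursively convert bytes keys and values to str (hypothetical helper)."""
    if isinstance(obj, bytes):
        return obj.decode(encoding)
    if isinstance(obj, dict):
        # Decode both keys and values of nested dictionaries.
        return {decode_bytes(k, encoding): decode_bytes(v, encoding)
                for k, v in obj.items()}
    if isinstance(obj, list):
        return [decode_bytes(item, encoding) for item in obj]
    if isinstance(obj, tuple):
        return tuple(decode_bytes(item, encoding) for item in obj)
    # Numbers, None, and other objects are returned unchanged.
    return obj
```

After data_dict = decode_bytes(data_dict), a lookup such as data_dict['key1'] should work again, provided the assumption about the encoding holds.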
Upvotes: 15
Reputation: 5381
After digging through the raw file in Sublime, it looks like the file was not correctly pickled. The above code works perfectly on a different version of that file.
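For anyone who wants to do the same check without a text editor, here is a rough sketch of peeking at the first bytes of the file from Python. describe_pickle_head is a made-up name, and the only fact it relies on is that pickles written with protocol 2 or higher start with the byte \x80 followed by the protocol number; older text-based pickles have no such header, so the function reports None for them.

```python
def describe_pickle_head(path, n=16):
    """Return the first n raw bytes of a file and its pickle protocol header, if any."""
    with open(path, "rb") as f:
        head = f.read(n)
    # Protocol 2+ pickles begin with the PROTO opcode b'\x80' and a protocol byte.
    if len(head) >= 2 and head[0] == 0x80:
        protocol = head[1]
    else:
        protocol = None
    return {"head": head, "binary_protocol": protocol}
```

If the returned head starts with something unexpected, such as the '\x0a' (newline) byte from the error message, that is a hint the file itself is damaged or was altered after it was written.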
Upvotes: -1
Reputation: 22301
Yeah, there are some changes between the Python 2 and 3 pickle formats. If possible, I'd recommend creating the pickled data again using Python 3.
If that's not possible/easy, try playing with different encoding settings (did you try 'utf8'?) or reading the data in with encoding='bytes' as mentioned here and then decoding the strings in your code where you can inspect the object further.
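To make "playing with different encoding settings" concrete, here is a small sketch that tries a few encodings in turn and falls back to encoding='bytes'. load_py2_pickle, the encoding list, and the PY2_PICKLE sample are all my own illustration: the sample is a byte string written by hand to mimic a protocol 2 pickle of the Python 2 dict {'key1': 'caf\xe9'}, not data from the question.

```python
import pickle

# Hand-written sample mimicking a Python 2, protocol 2 pickle of
# {'key1': 'caf\xe9'} (an assumption for illustration only).
PY2_PICKLE = b"\x80\x02}q\x00U\x04key1q\x01U\x04caf\xe9q\x02s."


def load_py2_pickle(data, encodings=("ASCII", "utf-8", "latin1")):
    """Try each encoding in order; fall back to raw bytes if none works."""
    for enc in encodings:
        try:
            return enc, pickle.loads(data, encoding=enc)
        except UnicodeDecodeError:
            # This encoding cannot decode the Python 2 str objects; try the next.
            continue
    # Last resort: keep Python 2 str objects as bytes and decode them later.
    return "bytes", pickle.loads(data, encoding="bytes")
```

For a file on disk, pickle.load(f, encoding=...) accepts the same encoding argument. Note that this only helps with decoding problems; it will not fix an UnpicklingError caused by a damaged file.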
Upvotes: 2