Reputation: 181
I am trying to load a pickled dictionary but keep getting an error such as this:
TypeError: a bytes-like object is required, not '_io.BufferedReader'
Below is the code that reads and writes the pickle object. I dump the pickled object on a Linux workstation with Python 2.7.12. The data is then transferred to a Mac with Python 3.6.4, where readTrueData() is executed, resulting in the above error.
def readTrueData(name):
    fName = str('trueData/'+name+'.pkl')
    f = open(fName,'rb')
    # print(f)
    # print(type(f))
    pC = pickle.loads(f)
    return pC
def storeTrueData(atomicConfigs, name):
    import quippy
    storeDic = {}
    #rangeKeys = len(atomicConfigs)
    #print(rangeKeys)
    qTrain = quippy.AtomsList(atomicConfigs)
    print(len(qTrain))
    rangeKeys = len(qTrain)
    print(rangeKeys)
    for i in range(rangeKeys):
        #configConsidered = atomicConfigs[i]
        trueForce = np.array(qTrain[i].force).T
        storeDic[i] = trueForce
    f = open("trueData/"+ name + ".pkl", "wb")
    pickle.dump(storeDic, f)
    f.close()
    return None
Following the suggestions in the comments, I tried each of these changes:
a.) pC = pickle.load(f)
b.) pC = pickle.loads(f.read())
In both cases I got the following error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0x87 in position 1: ordinal not in range(128)
Upvotes: 1
Views: 12144
Reputation: 3288
You need to be using pickle.load(...) to read if using open in that manner.
Source: https://docs.python.org/3/library/pickle.html
Upvotes: 2
Reputation: 155438
Your first problem is caused by a mismatch between the argument type and the chosen load* method; loads expects bytes objects, load expects the file object itself. Passing the file object to loads is what caused your error.
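As a minimal sketch of that distinction (the sample dictionary is made up, and io.BytesIO stands in for a file opened with 'rb'), load pairs with a file object, loads pairs with bytes, and mixing them raises the same kind of TypeError as in the question:

```python
import io
import pickle

# Stand-in for the stored dictionary (illustrative data only)
data = {0: [1.0, 2.0], 1: [3.0, 4.0]}

# dump/load operate on file objects; BytesIO plays the role of open(..., 'rb')
buf = io.BytesIO()
pickle.dump(data, buf)
buf.seek(0)
from_file = pickle.load(buf)

# dumps/loads operate on bytes objects
raw = pickle.dumps(data)
from_bytes = pickle.loads(raw)

# Handing a file object to loads is the mismatch that raises TypeError
buf.seek(0)
try:
    pickle.loads(buf)
    mismatch_error = None
except TypeError as exc:
    mismatch_error = exc
```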
Your other problem is due to the cross-version compatibility issue with numpy and datetime types; Python 2 pickles strs with no specified encoding, but Python 3 must unpickle them with a known encoding (or 'bytes', to get raw bytes rather than str). For numpy and datetime types, you're required to pass encoding='latin-1':
Optional keyword arguments are fix_imports, encoding and errors, which are used to control compatibility support for pickle stream generated by Python 2. If fix_imports is true, pickle will try to map the old Python 2 names to the new names used in Python 3. The encoding and errors tell pickle how to decode 8-bit string instances pickled by Python 2; these default to ‘ASCII’ and ‘strict’, respectively. The encoding can be ‘bytes’ to read these 8-bit string instances as bytes objects. Using encoding='latin1' is required for unpickling NumPy arrays and instances of datetime, date and time pickled by Python 2.
In any event, the fix is to change:
def readTrueData(name):
    fName = str('trueData/'+name+'.pkl')
    f = open(fName,'rb')
    # print(f)
    # print(type(f))
    pC = pickle.loads(f)
    return pC
to:
def readTrueData(name):
    fName = str('trueData/'+name+'.pkl')
    with open(fName, 'rb') as f:  # with statement avoids file leak
        # Match load with file object, and provide encoding for Py2 str
        return pickle.load(f, encoding='latin-1')
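To see what the encoding argument does without a Python 2 machine at hand, here is a small sketch run on Python 3. The byte string is a hand-written stream for illustration (not taken from the question's file) that contains a single Python 2 style 8-bit str with non-ASCII bytes:

```python
import pickle

# Hand-written protocol 2 stream holding one Python 2 style 8-bit str:
# PROTO 2, SHORT_BINSTRING of length 2 (bytes 0x87 0xff), STOP.
py2_stream = b'\x80\x02U\x02\x87\xff.'

# The default encoding is ASCII, so non-ASCII bytes fail to decode
try:
    pickle.loads(py2_stream)
    default_error = None
except UnicodeDecodeError as exc:
    default_error = exc

# latin-1 maps every byte to a code point, so decoding always succeeds
as_text = pickle.loads(py2_stream, encoding='latin-1')

# 'bytes' skips decoding and returns the raw bytes
as_bytes = pickle.loads(py2_stream, encoding='bytes')
```

The same keyword is accepted by pickle.load, which is the form used in the fixed readTrueData.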
For correctness and performance reasons, I'd also recommend changing pickle.dump(storeDic, f) to pickle.dump(storeDic, f, protocol=2) on the Python 2 machine, so the stream is generated with a more modern pickle protocol, one which can efficiently pickle numpy arrays among other things. Protocol 0, the Python 2 default, can't use the top bit of each byte (it's ASCII compatible), which means raw binary data bloats dramatically in protocol 0, requiring a ton of bit twiddling, where protocol 2 can dump it raw. Protocol 2 is also the only Py2 protocol that efficiently pickles new style classes, and the only one that can properly pickle certain types of instances (stuff using __slots__/__new__ and the like) at all.
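A rough way to see the difference between the two protocols is to serialize the same object with each and inspect the bytes. This sketch runs on Python 3 with made-up data; actual sizes depend heavily on the contents and on the Python version, so it only checks the format properties described above:

```python
import pickle

# Illustrative data only; the question's dictionary holds numpy arrays
data = {0: [1.5, 2.5], 1: [3.5, 4.5]}

p0 = pickle.dumps(data, protocol=0)  # text-based, ASCII-compatible format
p2 = pickle.dumps(data, protocol=2)  # binary format

# Protocol 0 output for this data never sets the top bit of a byte
p0_is_ascii = all(byte < 128 for byte in p0)

# Protocol 2 streams start with the PROTO opcode followed by the version
p2_header = p2[:2]

# Both streams round-trip to the same object
restored0 = pickle.loads(p0)
restored2 = pickle.loads(p2)
```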
I'd also recommend the script begin with:
try:
import cPickle as pickle
except ImportError:
import pickle
as on Python 2, pickle is implemented in pure Python, and is both slow and unable to use some of the more efficient pickle codes. On Python 3, cPickle is gone, but pickle is automatically accelerated. Between that and using protocol 2, pickling on the Python 2 machine should run much faster, and produce much smaller pickles.
Upvotes: 3
Reputation: 13878
pC = pickle.loads(f.read())
is what you're looking for, but you should really be using the with context:
with open(fName, 'rb') as f:
    pC = pickle.loads(f.read())
This would ensure your file is closed properly, especially because your code doesn't have an f.close() in the function.
Upvotes: 3