Reputation: 223
According to the NumPy documentation for np.save, an array is saved with allow_pickle=True by default, and the documentation also explains what can be problematic about this default:
allow_pickle : bool, optional
Allow saving object arrays using Python pickles. Reasons for disallowing pickles include security (loading pickled data can execute arbitrary code) and portability (pickled objects may not be loadable on different Python installations, for example if the stored objects require libraries that are not available, and not all pickled data is compatible between Python 2 and Python 3).
Default: True
After reading this, I would of course prefer to use allow_pickle=False, but the documentation does not say what changes when it is used this way. There must be some reason allow_pickle=True is the default despite its disadvantages.
Could you please tell me whether you use allow_pickle=False and how it behaves differently?
Upvotes: 18
Views: 20490
Reputation: 5019
An object array is just a normal NumPy array whose dtype is object; this happens when the contents of the array aren't one of the normal numerical types (int, float, etc.). We can try saving a NumPy array of objects to see how this works. A simple kind of object is a dict:
>>> import numpy as np
>>> a = np.array([{x: 1} for x in range(4)])
>>> a
array([{0: 1}, {1: 1}, {2: 1}, {3: 1}], dtype=object)
>>> np.save('test.pkl', a)
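Loading this back works fine as long as pickles are allowed (since NumPy 1.16.3, np.load defaults to allow_pickle=False, so it has to be enabled explicitly):
>>> np.load('test.pkl.npy', allow_pickle=True)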
array([{0: 1}, {1: 1}, {2: 1}, {3: 1}], dtype=object)
The array can't be saved without using pickle, though:
>>> np.save('test.pkl', a, allow_pickle=False)
...
ValueError: Object arrays cannot be saved when allow_pickle=False
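For comparison, here is a minimal sketch of how allow_pickle=False behaves in the other cases: a plain numeric array round-trips without any pickling, while an object array that was saved with pickling enabled is refused on load. The in-memory buffers and the variable names (numeric, restored, round_trip_ok, load_rejected) are just illustrative choices, not anything from the NumPy docs.

```python
import io
import numpy as np

# A plain numeric array needs no pickling, so allow_pickle=False is fine.
numeric = np.arange(6, dtype=np.float64).reshape(2, 3)
buf = io.BytesIO()
np.save(buf, numeric, allow_pickle=False)
buf.seek(0)
restored = np.load(buf, allow_pickle=False)
round_trip_ok = np.array_equal(numeric, restored)

# An object array written with pickling enabled is rejected on load
# when allow_pickle=False.
objects = np.array([{x: 1} for x in range(4)])
obj_buf = io.BytesIO()
np.save(obj_buf, objects, allow_pickle=True)
obj_buf.seek(0)
try:
    np.load(obj_buf, allow_pickle=False)
    load_rejected = False
except ValueError:
    load_rejected = True
```

So for ordinary numeric data the flag makes no visible difference; it only matters once the array's dtype is object.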
The rule of thumb for pickles is that you're safe if you're loading a pickle that you made, but you should be careful about loading pickles that you got from somewhere else. For one thing, if you don't have the same libraries (or library versions) installed that were used to make the pickle, you might not be able to load the pickle (this is what's meant by portability above). Security is another potential concern; you can read a bit about how pickles can be abused in this article, for instance.
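To make the security point concrete, here is a harmless sketch using only the standard pickle module. The Payload class is hypothetical; its __reduce__ tells pickle which callable to invoke at load time. Here that callable is the built-in len, but an untrusted file could name something far less benign.

```python
import pickle

class Payload:
    # __reduce__ returns (callable, args); pickle calls callable(*args)
    # when the data is loaded, instead of rebuilding a Payload instance.
    def __reduce__(self):
        return (len, ("abc",))

data = pickle.dumps(Payload())

# Loading runs len("abc") rather than returning a Payload object.
result = pickle.loads(data)
```

The loaded value is whatever the chosen callable returned, which shows that unpickling is effectively running code picked by whoever produced the file.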
Upvotes: 15