SalatYerakot

Reputation: 223

NumPy: consequences of using 'np.save()' with 'allow_pickle=False'

According to the NumPy documentation for np.save, an array is saved with allow_pickle=True by default, and the documentation also explains what can be problematic about this default behavior:

allow_pickle : bool, optional
Allow saving object arrays using Python pickles. Reasons for disallowing pickles include security (loading pickled data can execute arbitrary code) and portability (pickled objects may not be loadable on different Python installations, for example if the stored objects require libraries that are not available, and not all pickled data is compatible between Python 2 and Python 3).
Default: True

After reading it, I would of course prefer to use allow_pickle=False, but the documentation does not say what is different when it is used this way. There must be some reason allow_pickle=True is the default despite its disadvantages.

Could you please tell me whether you use allow_pickle=False and how it behaves differently?

Upvotes: 18

Views: 20490

Answers (1)

wildwilhelm

Reputation: 5019

An object array is just a normal numpy array where the dtype is object; this happens if the contents of the array aren't of the normal numerical types (like int or float, etc.). We can try out saving a numpy array with objects, just to test how this works. A simple kind of object would be a dict:

>>> import numpy as np
>>> a = np.array([{x: 1} for x in range(4)])
>>> a
array([{0: 1}, {1: 1}, {2: 1}, {3: 1}], dtype=object)
>>> np.save('test.pkl', a)

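Loading this back works fine, as long as pickling is allowed on the load side too (since NumPy 1.16.3, np.load defaults to allow_pickle=False, so it has to be passed explicitly for object arrays):

>>> np.load('test.pkl.npy', allow_pickle=True)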
array([{0: 1}, {1: 1}, {2: 1}, {3: 1}], dtype=object)

The array can't be saved without using pickle, though:

>>> np.save('test.pkl', a, allow_pickle=False)
...
ValueError: Object arrays cannot be saved when allow_pickle=False
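
For ordinary numeric arrays, on the other hand, allow_pickle=False should make no visible difference, since no pickling is needed to store them. Here is a minimal sketch of that contrast; the file names and the temporary directory are just assumptions for illustration, not anything from the NumPy docs:

```python
import os
import tempfile

import numpy as np

# A plain numeric array: its contents can be written as raw bytes.
numeric = np.arange(6, dtype=np.float64).reshape(2, 3)

# An object array (dtype=object), like the array of dicts above.
objects = np.array([{x: 1} for x in range(4)])

with tempfile.TemporaryDirectory() as tmp:
    numeric_path = os.path.join(tmp, "numeric.npy")  # hypothetical file name

    # Numeric dtypes round-trip with pickling disabled on both sides.
    np.save(numeric_path, numeric, allow_pickle=False)
    restored = np.load(numeric_path, allow_pickle=False)
    roundtrip_ok = np.array_equal(numeric, restored)

    # Object arrays are refused at save time.
    try:
        np.save(os.path.join(tmp, "objects.npy"), objects, allow_pickle=False)
        object_save_error = None
    except ValueError as exc:
        object_save_error = str(exc)

print(roundtrip_ok)
print(object_save_error)
```

So in practice the flag only matters when the array has dtype=object; for int, float, bool and similar dtypes you can pass allow_pickle=False without losing anything.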

The rule of thumb for pickles is that you're safe if you're loading a pickle that you made, but you should be careful about loading pickles that you got from somewhere else. For one thing, if you don't have the same libraries (or library versions) installed that were used to make the pickle, you might not be able to load the pickle (this is what's meant by portability above). Security is another potential concern; you can read a bit about how pickles can be abused in this article, for instance.
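
If you do have to load a file from somewhere else, one cautious approach is to keep pickling disabled when loading, so that a file containing pickled objects is rejected instead of being unpickled. A rough sketch, where load_untrusted is a hypothetical helper name of my own and not part of NumPy:

```python
import os
import tempfile

import numpy as np


def load_untrusted(path):
    """Hypothetical helper: load a .npy file, refusing anything that needs unpickling."""
    return np.load(path, allow_pickle=False)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "objects.npy")  # hypothetical file name

    # Stand-in for a file received from elsewhere that contains pickled objects.
    np.save(path, np.array([{x: 1} for x in range(4)]), allow_pickle=True)

    try:
        load_untrusted(path)
        refused = False
    except ValueError:
        # np.load raises ValueError rather than unpickling the contents.
        refused = True

print(refused)
```

This does not make the file trustworthy; it only ensures that nothing gets unpickled while reading it.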

Upvotes: 15
