Reputation: 446
I have a dictionary that is 2280 bytes according to
sys.getsizeof(myDictionary)
When I save it to a file with pickle
with open("dictionary.txt", "wb") as fp:  # Pickling
    pickle.dump(myDictionary, fp)
the file is suddenly about 100 KB in size.
Is it possible to get the exact binary representation of that dictionary, save it to a file, and later load that file back as a dictionary?
If that isn't possible in Python, is it possible in another programming language? It is important that the file be as small as possible.
Upvotes: 1
Views: 508
Reputation: 44838
Quote from the docs about sys.getsizeof:
Only the memory consumption directly attributed to the object is accounted for, not the memory consumption of objects it refers to.
Well, objects in Python refer to other objects a lot, so chances are getsizeof won't help much here.
For example:
>>> import pickle, sys
>>> a = {'a': 1, 'b': 2}
>>> sys.getsizeof(a)
240 # WUT
>>> len(pickle.dumps(a))
28 # looks legit
Then do:
>>> p = [1,2,3,4,5]
>>> a['k'] = p
>>> sys.getsizeof(a)
240 # WUT, unchanged even though a list was added
>>> len(pickle.dumps(a))
51 # looks legit
So the value reported by getsizeof does not reflect the dictionary's contents: it covers only the dict's own structure, which stores references to the keys and values rather than the objects themselves. If you saved only that in-memory representation, you would be saving a bunch of pointers to basically nowhere, since they would be invalid once you loaded the data back. You can use this recursive recipe to find the size of an object together with its contents.
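As a rough illustration of the recursive idea (this is not the linked recipe itself, just a minimal sketch), the function below walks dicts and the common built-in containers and sums sys.getsizeof over everything it reaches. The name deep_getsizeof is made up for this example, and it deliberately ignores custom classes, __slots__, and other container types, so treat the result as an estimate only.

```python
import sys

def deep_getsizeof(obj, seen=None):
    """Approximate size in bytes of obj plus the objects it refers to.

    Hypothetical helper for illustration: handles only dict, list,
    tuple, set and frozenset; anything else is counted on its own.
    """
    if seen is None:
        seen = set()
    if id(obj) in seen:
        # Already counted (shared reference or cycle), do not count twice.
        return 0
    seen.add(id(obj))

    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += deep_getsizeof(key, seen)
            size += deep_getsizeof(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += deep_getsizeof(item, seen)
    return size

a = {'a': 1, 'b': 2, 'k': [1, 2, 3, 4, 5]}
print(sys.getsizeof(a), deep_getsizeof(a))
```

The second number printed should be larger than the first, because it also counts the keys, the integers, and the list that the dictionary points to. The exact values depend on the Python version and platform.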
If you want your file to be as small as possible, consider compressing the values in the dictionary or using a different data representation.
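One way to try the compression idea is to compress the whole pickled payload rather than individual values. This is only a sketch using the standard pickle and zlib modules; the helper names pack, unpack, save_compressed and load_compressed are invented for this example, and how much space it saves depends entirely on how repetitive your data is.

```python
import pickle
import zlib

def pack(obj):
    """Pickle obj with the newest protocol, then compress the bytes."""
    raw = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    return zlib.compress(raw, 9)  # 9 = slowest, smallest output

def unpack(data):
    """Inverse of pack: decompress, then unpickle."""
    return pickle.loads(zlib.decompress(data))

def save_compressed(obj, path):
    # Hypothetical helper: write the compressed pickle to a file.
    with open(path, "wb") as fp:
        fp.write(pack(obj))

def load_compressed(path):
    # Hypothetical helper: read the file back into a Python object.
    # Only unpickle files you trust; pickle can execute arbitrary code.
    with open(path, "rb") as fp:
        return unpack(fp.read())
```

Usage would look like save_compressed(myDictionary, "dictionary.bin") followed later by myDictionary = load_compressed("dictionary.bin"). Comparing len(pickle.dumps(myDictionary)) with len(pack(myDictionary)) shows whether compression actually helps for your data.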
Upvotes: 1