Reputation: 50200
Some times a pickle dump is unexpectedly large. Assuming I can successfully pickle and unpickle an object, is there a way to inspect the dump and see exactly what is included?
Pickled objects include data but not code. If I didn't write the code, and the object is complex (e.g., an instance of a custom class with accessors, and lots of references to other data) it can be difficult to identify what is included in the dump and taking up so much space. Hence this question.
Upvotes: 3
Views: 1671
Reputation: 127240
The built-in pickletools module can output information about each opcode represented in a pickle file. When used from the command line or with dis
, it outputs the opcodes in a readable format. The example from the docs:
For example, with a tuple (1, 2) pickled in file x.pickle:
$ python -m pickle x.pickle (1, 2) $ python -m pickletools x.pickle 0: \x80 PROTO 3 2: K BININT1 1 4: K BININT1 2 6: \x86 TUPLE2 7: q BINPUT 0 9: . STOP highest protocol among opcodes = 2
To get detailed information about an opcode, look in the code2op
dict. Use genops
to iterate over the pickle data along with this detailed information. For example, \x86 TUPLE2
from above means:
>>> print(pickletools.code2op['\x86'].doc)
Build a two-tuple out of the top two items on the stack.
This code pops two values off the stack and pushes a tuple of
length 2 whose items are those values back onto it. In other
words:
stack[-2:] = [tuple(stack[-2:])]
Note that while a pickle is potentially unsafe to load (as it can execute arbitrary code), the pickle is not actually loaded when disassembling it, so it is safe to inspect the data.
Upvotes: 4