Unpickle big python object through subprocess

Question

How can I pass a large pickled object from a subprocess and unpickle it in the parent? My example below works for a small object (a dictionary), but stops working once the object holds more data.

Here's my working sample:

return_pickle.py

import pickle
import io
import sys

NUMS = 10
    
sample_obj = {'a':1, 'b': [x for x in range(NUMS)]}
d = pickle.dumps(sample_obj)
sys.stdout = io.TextIOWrapper(sys.stdout.detach(), encoding='latin-1')
print(d.decode('latin-1'), end='', flush=True)

unpickle.py

import subprocess
import pickle

proc = subprocess.Popen(["python", "return_pickle.py"], stdout=subprocess.PIPE, stderr=subprocess.STDOUT)

output, err = proc.communicate()
data = pickle.loads(output)
print(data)

The above works as is, but if I change NUMS to 100 it fails with _pickle.UnpicklingError: invalid load key, '\x0a'. I get the same error if I change sample_obj to hold a large list of dictionaries. How do I get around this?

I am using Python 3.7 on a Windows 10 machine.

flakes · Accepted Answer

Works for me on a Windows machine if you don't stringify the result and instead write it directly to the stdout buffer:

return_pickle.py

import pickle, sys

sample_obj = {'a':1, 'b': [x for x in range(100)]}
sys.stdout.buffer.write(pickle.dumps(sample_obj))

unpickle.py

import subprocess, pickle

proc = subprocess.Popen(
    ["python", "return_pickle.py"],
    stdout=subprocess.PIPE,
    stderr=subprocess.DEVNULL,
)

output, _ = proc.communicate()
print(pickle.loads(output))
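
A likely explanation for the original failure, as far as I can tell: the text layer around stdout translates every '\n' it writes into '\r\n' on Windows. The integer 10 is pickled as the byte 0x0a, so once the list reaches range(100) the stream picks up an extra 0x0d and the unpickler then finds 0x0a where it expects an opcode. The sketch below reproduces that translation on any platform. The helper names are made up for illustration, and newline="\r\n" is an assumption standing in for the Windows default.

```python
import io
import pickle


def simulate_windows_text_stdout(payload: bytes) -> bytes:
    """Return the bytes that reach the pipe when payload is printed as latin-1 text."""
    raw = io.BytesIO()
    # Assumption: newline="\r\n" mimics the default '\n' -> '\r\n' translation on Windows.
    text = io.TextIOWrapper(raw, encoding="latin-1", newline="\r\n")
    text.write(payload.decode("latin-1"))
    text.flush()
    received = raw.getvalue()
    text.detach()  # stop the wrapper from closing raw when it is collected
    return received


def survives_text_layer(obj) -> bool:
    """True if the pickle bytes come out of the text layer unchanged."""
    payload = pickle.dumps(obj)
    return simulate_windows_text_stdout(payload) == payload


small = {'a': 1, 'b': list(range(10))}
large = {'a': 1, 'b': list(range(100))}
print(survives_text_layer(small))
print(survives_text_layer(large))  # False: the byte for the integer 10 gets a '\r' prepended
```

Writing to sys.stdout.buffer skips the text layer entirely, so no bytes are rewritten. Note also that the question's unpickle.py uses stderr=subprocess.STDOUT, which would mix any error output into the pickle stream; the version above sends stderr to DEVNULL instead.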

