Shivaraman

Reputation: 21

How to exchange data between two different python processes of different conda environments?

I have two Python scripts, A and B, that have compatibility issues and therefore need separate conda environments. Here is the scenario: when script A runs, it sends data to process B (script B is running in a different terminal), and process B returns its output to process A (process A cannot be put to sleep). I have been using pickle files to exchange data between the two processes, but this method seems slow. I would like to speed it up, which is necessary for my work.

Upvotes: 2

Views: 884

Answers (2)

stefano marchesini

Reputation: 101

You could use parent.py, which calls child.sh to set up the conda environment, which in turn calls child.py. You can pass the communication parameters when you launch the child, so you don't have to launch the child manually from another terminal. For example, using zmq for communication (helper module below) and passing the port number to the child:

parent.py

import numpy as np
import communicator

# set up the communicator
send, recv, port = communicator.setup(child = './child.sh')

def child_process(data):
    send(data) # send
    # something done by child
    return recv() #receive

# create data
data = np.random.random((3,3))

transformed_data = child_process(data)

print('data', data)
print('transformed data', transformed_data)

child.sh

#!/bin/bash
# set up environment variables and conda
#source /cds/sw/ds/ana/conda2/manage/bin/psconda.sh
#conda deactivate
#conda activate ps-4.5.26


# echo "using port: $1"
# run in the background so the blocking call() in communicator.py returns
python child.py "$1" &

child.py

import numpy as np
import sys
import communicator

# get the port info
zmq_port = sys.argv[1] 

#connect to the port
send, recv, port = communicator.setup(zmq_port)

# get the data
data = recv()

# transform
data = data + 1

# send back (send() returns nothing, so there is nothing to assign)
send(data)

communicator.py

import pickle
from subprocess import call

import zmq

context = zmq.Context()
zmq_socket = context.socket(zmq.PAIR)


def send(obj):
    """Pickle an object, then send it."""
    zmq_socket.send(pickle.dumps(obj))


def recv():
    """Receive a message, then unpickle it."""
    return pickle.loads(zmq_socket.recv())


def setup(port=None, child=None):
    if port is None:  # parent: bind to a free port
        port_selected = zmq_socket.bind_to_random_port(
            'tcp://*', min_port=50000, max_port=65004, max_tries=1000)
        if child is not None:
            # launch the child with the port number; the child script has to
            # put child.py in the background, otherwise call() blocks here
            call(f'{child} {port_selected}', shell=True)
    else:  # child: connect to the port chosen by the parent
        zmq_socket.connect("tcp://localhost:%s" % port)
        port_selected = int(port)

    return send, recv, port_selected

Upvotes: 0

Ahmed AEK

Reputation: 17775

  1. Make one program a child of the other using the subprocess module and communicate over stdin and stdout (fastest). Note that the command that launches the child has to activate the other Anaconda environment.
  2. Have one application act as a server bound to a socket on localhost, with the other application as the client, using the socket module (the most organized and scalable solution).
  3. Use multiprocessing.shared_memory to create a block of memory that both applications can read and write. This requires proper synchronization, but it can be faster than the first option when transferring gigabytes of data at a time. (Wrapping it in an io.TextIOWrapper will make communication a lot easier, as easy as working with sockets.)
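
As a rough illustration of option 1, here is a minimal sketch of a parent that keeps a child process alive and exchanges pickled objects with it over stdin/stdout. Several things here are assumptions and not part of the answer above: the names CHILD_PYTHON, CHILD_SOURCE, write_msg, read_msg, start_child, request and stop_child are made up for the example, the 4-byte length prefix is just one possible framing, and the child is inlined as a string so the sketch runs anywhere. In practice CHILD_PYTHON would point at the interpreter of the other conda environment, and the child would live in its own file.

```python
import pickle
import struct
import subprocess
import sys

# Assumption: in real use this is the Python interpreter of the other conda
# environment (a hypothetical path such as /path/to/envs/env_b/bin/python).
# sys.executable is used here only so the sketch runs as-is.
CHILD_PYTHON = sys.executable

# A fixed pickle protocol, in case the two environments run different
# Python versions (protocol 4 is understood by Python 3.4 and later).
PICKLE_PROTOCOL = 4

# Hypothetical child program: reads framed requests from stdin, adds 1 to
# every element, and writes the framed result to stdout.
CHILD_SOURCE = r"""
import pickle, struct, sys

def read_msg(stream):
    header = stream.read(4)
    if len(header) < 4:
        return None  # parent closed the pipe
    (size,) = struct.unpack("<I", header)
    return pickle.loads(stream.read(size))

def write_msg(stream, obj):
    payload = pickle.dumps(obj, protocol=4)
    stream.write(struct.pack("<I", len(payload)) + payload)
    stream.flush()

while True:
    data = read_msg(sys.stdin.buffer)
    if data is None:
        break
    write_msg(sys.stdout.buffer, [x + 1 for x in data])
"""


def write_msg(stream, obj):
    """Send one object as a 4-byte little-endian length plus pickle bytes."""
    payload = pickle.dumps(obj, protocol=PICKLE_PROTOCOL)
    stream.write(struct.pack("<I", len(payload)) + payload)
    stream.flush()


def read_msg(stream):
    """Read one framed object; raise EOFError if the child has gone away."""
    header = stream.read(4)
    if len(header) < 4:
        raise EOFError("child closed the pipe")
    (size,) = struct.unpack("<I", header)
    return pickle.loads(stream.read(size))


def start_child():
    """Launch the child once; it stays alive between requests."""
    return subprocess.Popen(
        [CHILD_PYTHON, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )


def request(child, obj):
    """Send one object to the child and block until its reply arrives."""
    write_msg(child.stdin, obj)
    return read_msg(child.stdout)


def stop_child(child):
    """Closing stdin makes the child's read loop end, then wait for exit."""
    child.stdin.close()
    child.wait()


if __name__ == "__main__":
    child = start_child()
    print(request(child, [1, 2, 3]))
    stop_child(child)
```

With this layout the child must not print anything else to stdout, since stdout carries the framed replies; diagnostics can go to stderr instead. Because the child is started once and reused, each exchange costs one pickle round trip through the pipe, with no file I/O or interpreter start-up.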

Upvotes: 4
