gies0r

Reputation: 5239

Pulling data from a queue in background thread in python process

Assuming you are processing a live stream of data like this:

[Image: async_loading_of_queued_data_in_python_process]

What would be the best way to have a background thread update the data variable while the main logic runs do_some_logic inside the endless loop?

I have some experience with parallelization that has a clear start and end point, using multiprocessing/multithreading, but I am unsure how to continuously run a background thread that updates an internal variable. Any advice would be helpful. Thanks!

Upvotes: 1

Views: 1752

Answers (2)

wwii

Reputation: 23743

Have the background thread build separate DataFrames from the data it retrieves from the live feed and send them to the main thread, which appends them to its own DataFrame. All of the DataFrames should have the same structure.

  • Subclass threading.Thread
    • give it two attributes:
      • a reference to the live feed queue and
      • a reference to a main thread queue
    • in a continuous loop its run method should accumulate rows from the live feed queue in a dictionary
    • when a predetermined number of rows has been accumulated:
      • make a DataFrame from the dictionary
      • put the DataFrame on the main thread queue
      • make a new empty dictionary to be subsequently filled
  • In the main thread
    • make an empty DataFrame with the required columns
    • make a queue
    • make an instance of the Thread passing it the two queues
    • In a loop
      • check the queue: if anything is there append or concatenate it to the DataFrame
      • do stuff
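The steps above can be sketched roughly as follows. This is only a minimal illustration under some assumptions: the class name `BatchingWorker`, the helper `consume`, the `None` sentinel that marks the end of the feed, and the batch size are all hypothetical. To stay dependency-free it batches rows as lists of dicts; with pandas, each batch would become `pd.DataFrame(batch)` and the main thread would concatenate it onto its DataFrame.

```python
import queue
import threading


class BatchingWorker(threading.Thread):
    """Hypothetical worker: drains a feed queue and forwards fixed-size batches."""

    def __init__(self, feed_queue, main_queue, batch_size=3):
        super().__init__(daemon=True)
        self.feed_queue = feed_queue    # reference to the live feed queue
        self.main_queue = main_queue    # reference to the main thread queue
        self.batch_size = batch_size

    def run(self):
        batch = []
        while True:
            row = self.feed_queue.get()         # blocks until a row arrives
            if row is None:                     # assumed sentinel: feed is finished
                if batch:
                    self.main_queue.put(batch)  # flush the partial batch
                self.main_queue.put(None)       # tell the main thread we are done
                return
            batch.append(row)
            if len(batch) >= self.batch_size:
                self.main_queue.put(batch)      # with pandas: pd.DataFrame(batch)
                batch = []                      # start a fresh batch


def consume(feed_queue, batch_size=3):
    """Main-thread loop: collect batches while staying free to do other work."""
    main_queue = queue.Queue()
    BatchingWorker(feed_queue, main_queue, batch_size).start()
    rows = []  # with pandas this would be the accumulated DataFrame
    while True:
        try:
            batch = main_queue.get(timeout=0.05)
        except queue.Empty:
            continue            # nothing new yet: the "do stuff" step would run here
        if batch is None:
            return rows
        rows.extend(batch)      # with pandas: pd.concat([df, batch_df])


# Simulated live feed: seven rows followed by the sentinel.
feed = queue.Queue()
for i in range(7):
    feed.put({"id": i, "value": i * 10})
feed.put(None)
all_rows = consume(feed)
```

Using `queue.Queue` for both directions means no explicit locking is needed, since the queue handles synchronization between the two threads.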

Upvotes: 1

nz_21

Reputation: 7343

Write an update function and periodically run a background thread.

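import threading

def update_data(data):
    # placeholder: refresh `data` from the live feed here
    pass

def my_inline_function(some_args):
    # do some stuff
    # some_args must be a tuple matching update_data's parameters, e.g. (data,)
    t = threading.Thread(target=update_data, args=some_args)
    t.start()
    # continue doing stuff while update_data runs in the background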
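If the update has to run continuously rather than once per call, one possible variant is a single long-lived thread that keeps refreshing a shared value until it is told to stop. The sketch below is an assumption-laden illustration, not a definitive implementation: `SharedData`, `updater`, the counter that stands in for the live feed, and the timing values are all hypothetical.

```python
import threading
import time


class SharedData:
    """Hypothetical holder for the value the background thread keeps fresh."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def set(self, value):
        with self._lock:        # guard writes from the background thread
            self._value = value

    def get(self):
        with self._lock:        # guard reads from the main thread
            return self._value


def updater(shared, stop_event, interval=0.01):
    """Runs in the background until stop_event is set."""
    counter = 0
    while not stop_event.is_set():
        counter += 1
        shared.set(counter)         # stand-in for reading the live feed
        stop_event.wait(interval)   # sleep, but wake early when asked to stop


shared = SharedData()
stop_event = threading.Event()
worker = threading.Thread(target=updater, args=(shared, stop_event), daemon=True)
worker.start()

time.sleep(0.1)         # stand-in for the main loop's own work
latest = shared.get()   # the main thread reads the most recent value
stop_event.set()        # ask the background thread to finish
worker.join()
```

The `threading.Event` gives the main thread a clean way to shut the worker down, and the lock keeps reads and writes of the shared value consistent.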

Understand the constraints of the GIL (global interpreter lock) so you know whether threading is really what you need.

I'd suggest looking into async/await to get a better idea of how concurrency works in Python. It's a similar model to JavaScript: your main program is single-threaded, and it uses the time spent waiting on I/O-bound tasks to switch context between different parts of your application.
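As a rough illustration of that model, the sketch below runs a simulated feed and the consuming logic on one thread with `asyncio`. The coroutine names `producer` and `main_logic`, the `None` sentinel, and the `asyncio.sleep` call standing in for real I/O are assumptions made for the example.

```python
import asyncio


async def producer(q, n):
    """Hypothetical feed: emits n items, yielding control on each simulated wait."""
    for i in range(n):
        await asyncio.sleep(0.01)   # simulated I/O wait
        await q.put(i)
    await q.put(None)               # assumed sentinel: feed is finished


async def main_logic(q):
    """Consumes items; each await lets the producer run on the same thread."""
    seen = []
    while True:
        item = await q.get()
        if item is None:
            return seen
        seen.append(item)


async def main():
    q = asyncio.Queue()
    feed_task = asyncio.create_task(producer(q, 5))  # schedule the feed
    seen = await main_logic(q)
    await feed_task                 # make sure the producer has finished
    return seen


result = asyncio.run(main())
```

No locks are needed here because only one coroutine runs at a time, and control changes hands only at an `await`.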

If this doesn't meet your requirements, look into multiprocessing: specifically, how to spin up a new process and how to share variables between processes.

Upvotes: 3
