arthem

Reputation: 151

Simultaneous Python scripts?

I need to run two Python 3 scripts simultaneously. The first script (app1.py) provides information to the second script (app2.py). Both scripts need to run at the same time, ideally launched from a single script.

Script 1 is a bs4-based scraping script that runs in an infinite loop and never ends. Script 2 is a Flask web app that displays information from Script 1. Is it possible to run Script 1 without importing it? Importing it causes issues because Script 1 runs in an infinite loop.

How do I run both scripts together from a single script?

Upvotes: 3

Views: 61

Answers (1)

Edwin Shepherd

Reputation: 453

Design

First, before adding complexity (particularly around concurrent programming), you should ask: do I really need to do this? Could the Flask app trigger a new scrape on each request instead?

Concurrency

When tasks need to run alongside each other in Python, there are three main ways to do this:

  1. multithreading
  2. multiprocessing
  3. asyncio

Processes are separate as far as the operating system is concerned, and each process contains one or more threads. asyncio takes a different approach: it runs many tasks cooperatively on a single thread, so you don't have to think about how the OS schedules them.

Python (specifically CPython) has a feature called the Global Interpreter Lock (GIL), which means that only one thread can execute Python bytecode at a time within a process. So if your application uses multithreading, one thread has to wait whilst another runs Python code. Note that this limit only applies to executing bytecode: the GIL is released during blocking I/O, so for I/O-heavy work like a Flask server or a scraper waiting on the network you will probably find there is enough idle time that multithreading still works well.
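As a rough illustration of the threading option, the sketch below runs a stand-in scraping loop in a background thread whilst the main thread stays free. The names scrape_loop, latest and run_demo are made up for this example, and stop.wait() stands in for the network wait during which a real scraper would release the GIL.

```python
import threading
import time


def scrape_loop(latest, lock, stop, interval=0.01):
    """Hypothetical stand-in for the scraper: update shared state until told to stop."""
    while not stop.is_set():
        with lock:
            latest["count"] += 1  # pretend this is freshly scraped data
        # Waiting (like blocking network I/O) releases the GIL,
        # so other threads can run in the meantime.
        stop.wait(interval)


def run_demo(duration=0.1):
    latest = {"count": 0}      # state shared between the two threads
    lock = threading.Lock()    # guards access to `latest`
    stop = threading.Event()   # lets the main thread end the loop cleanly

    worker = threading.Thread(
        target=scrape_loop, args=(latest, lock, stop), daemon=True
    )
    worker.start()
    time.sleep(duration)       # the main thread could be serving requests here
    stop.set()
    worker.join()
    with lock:
        return dict(latest)


if __name__ == "__main__":
    print(run_demo())
```

Because threads share memory, the web side can read latest directly, as long as every access goes through the lock.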

Why go for multiprocessing?

A lot of work has been put into making the interfaces of multithreading and multiprocessing very similar, so it adds very little complexity. If you want to be sure the scraper isn't clogging up your server, it might be easiest just to use multiprocessing.

Why go for multithreading?

The downside of multiprocessing is that Python has to pickle data passed between your processes, as they can't share memory the way threads can. Compared to multithreading this is slow, but it's still pretty fast for reasonable amounts of data. Remember that "premature optimisation is the root of all evil": profile your code before and after optimising to decide whether it was worth it.
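To make the pickling point concrete, here is a minimal sketch of passing data between two processes with multiprocessing.Queue. The functions scrape_into and drain and the record format are invented for illustration, and None is used as an assumed "no more data" marker.

```python
import multiprocessing as mp


def scrape_into(queue, n_items=3):
    """Hypothetical producer: put a few 'scraped' records on the queue."""
    for i in range(n_items):
        # With a multiprocessing.Queue, each record is pickled on its way
        # to the other process.
        queue.put({"item": i})
    queue.put(None)  # sentinel: nothing more to send


def drain(queue):
    """Consumer side: collect records until the sentinel arrives."""
    records = []
    while True:
        record = queue.get()
        if record is None:
            break
        records.append(record)
    return records


if __name__ == "__main__":
    q = mp.Queue()
    producer = mp.Process(target=scrape_into, args=(q,))
    producer.start()
    print(drain(q))  # read everything before joining the producer
    producer.join()
```

Anything put on the queue must be picklable, so plain dicts, lists and strings are the safest things to send.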

Why go for asyncio?

asyncio was added to Python with the aim of "making writing explicitly asynchronous, concurrent Python code easier and more Pythonic", though some people would disagree. I think you are best off trying it and seeing if it works for you. From the sounds of it, your application isn't large enough to really benefit from the massive concurrency that asyncio allows.
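If you do want to try asyncio, the following sketch shows the basic shape: two coroutines taking turns on a single thread. The names scrape_loop and report_loop are hypothetical, and asyncio.sleep stands in for awaiting a real async HTTP client. A blocking call inside either coroutine would stall both of them.

```python
import asyncio


async def scrape_loop(latest, rounds=3, interval=0.01):
    """Hypothetical scraper: a real one would await an async HTTP client here."""
    for i in range(rounds):
        await asyncio.sleep(interval)  # hands control to other tasks
        latest["count"] = i + 1


async def report_loop(latest, seen, rounds=3, interval=0.01):
    """Hypothetical stand-in for the web side: read whatever is current."""
    for _ in range(rounds):
        await asyncio.sleep(interval)
        seen.append(latest["count"])


async def main():
    latest = {"count": 0}
    seen = []
    # Both coroutines make progress on one thread, switching at each `await`.
    await asyncio.gather(scrape_loop(latest), report_loop(latest, seen))
    return latest, seen


if __name__ == "__main__":
    print(asyncio.run(main()))
```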

Personally I would choose multiprocessing for this kind of thing.

Imports

It is generally not desirable for import my_script_which_loops to hang forever. Instead, you will often see something like the following:

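# my_script_which_loops.py

def main():
    # The infinite loop lives inside a function, so importing this module is safe.
    while True:
        print("I am scraping the thing!")

# Only start looping when this file is run directly, not when it is imported.
if __name__ == "__main__":
    main()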

This means that if you run python my_script_which_loops.py then you will scrape the thing as intended. However, if the script isn't the main script, importing it won't hang. Please see here for more info.
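With the loop wrapped in main(), a single launcher script can start both parts as separate processes. The sketch below is one possible layout, not the only one: scraper_main and web_main are stand-ins defined inline so the example is self-contained, and the half-second sleep followed by terminate() is there only so the demo ends.

```python
import multiprocessing as mp
import time


def scraper_main():
    """Stand-in for main() in my_script_which_loops.py."""
    while True:
        time.sleep(0.1)  # a real scraper would fetch and parse here


def web_main():
    """Hypothetical stand-in for whatever starts the Flask app."""
    while True:
        time.sleep(0.1)


def build_processes():
    # In a real project the targets would be imported instead, e.g.
    # `from my_script_which_loops import main as scraper_main`.
    return [
        mp.Process(target=scraper_main, name="scraper"),
        mp.Process(target=web_main, name="web"),
    ]


if __name__ == "__main__":
    processes = build_processes()
    for process in processes:
        process.start()
    time.sleep(0.5)  # demo only: a real launcher would join() instead
    for process in processes:
        process.terminate()
        process.join()
```

The launcher needs its own if __name__ == "__main__": guard as well, because on some platforms multiprocessing re-imports the launching module in each child process.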

Upvotes: 3
