Reputation: 151
I need to run two Python 3 scripts simultaneously. The first script (app1.py) provides information to the second script (app2.py). Both scripts need to run at the same time, ideally launched from a single script.
Script 1 is a bs4-based scraping script that runs in an infinite loop and never exits. Script 2 is a Flask web app that displays information from script 1. Is it possible to run script 1 without importing it? Importing it causes problems because script 1 runs in an infinite loop.
How do I run both scripts together from a single script?
Upvotes: 3
Views: 61
Reputation: 453
First, before adding complexity (particularly around concurrent programming), you should ask: do I really need to do this? Could the Flask app trigger a new scrape on a request?
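As a rough sketch of that simpler option, the web app could scrape on demand and cache the result for a while, so there is no background loop at all. Everything here is an assumption made for illustration: `scrape()` stands in for the real bs4 code, and `CACHE_SECONDS` and `get_data()` are hypothetical names. A Flask view would simply call `get_data()`.

```python
import time

CACHE_SECONDS = 60  # hypothetical freshness window; tune to taste

# Module-level cache shared by every request handled in this process.
_cache = {"data": None, "fetched_at": None}


def scrape():
    """Placeholder for the real bs4 scraping code (hypothetical)."""
    return {"scraped_at": time.time()}


def get_data(now=None):
    """Return cached data, re-scraping only when the cache is stale."""
    now = time.monotonic() if now is None else now
    stale = (
        _cache["fetched_at"] is None
        or now - _cache["fetched_at"] >= CACHE_SECONDS
    )
    if stale:
        _cache["data"] = scrape()
        _cache["fetched_at"] = now
    return _cache["data"]
```

The `now` parameter only exists so the staleness logic can be exercised without waiting; normal callers would leave it out.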
When tasks need to run next to each other in Python, there are three main ways to do it: multithreading, multiprocessing and asyncio. Processes are separate things as far as the operating system is concerned, and each process contains threads. asyncio is another way of thinking about this which allows you to forget about the OS.
Python has a feature called the Global Interpreter Lock (GIL), which basically means that only one thread can interpret bytecode at a time in a process. This means that if your application uses multithreading, one thread will freeze whilst another does other things. It should be noted that this limit only applies to interpreting the bytecode; if the work is IO intensive, like a Flask server, then you will probably find that there is enough time whilst the server is off waiting on IO that you can still use multithreading.
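To make the threading option concrete, here is a minimal sketch, assuming the scraper can be written as a function that loops until told to stop. The names `scrape_loop`, `read_latest` and `latest` are hypothetical; the real scraper would do network IO where this one just waits, and the Flask view would call `read_latest()`.

```python
import threading
import time

latest = {"value": None}   # state shared between the two threads
lock = threading.Lock()    # guards access to `latest`


def scrape_loop(stop_event, interval=0.01):
    """Stand-in for the scraper: keep updating `latest` until stopped."""
    count = 0
    while not stop_event.is_set():
        count += 1
        with lock:
            latest["value"] = f"result {count}"
        # Waiting (like blocking on the network) lets other threads run.
        stop_event.wait(interval)


def read_latest():
    """What the web app would call when it handles a request."""
    with lock:
        return latest["value"]


if __name__ == "__main__":
    stop = threading.Event()
    worker = threading.Thread(target=scrape_loop, args=(stop,), daemon=True)
    worker.start()
    time.sleep(0.1)        # the Flask app would run here instead
    print(read_latest())
    stop.set()
    worker.join()
```

Because threads share memory, the scraper and the reader can use the same dictionary directly; the lock is just a precaution for when the shared state grows beyond a single assignment.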
multiprocessing?
A lot of work has been put into making the interfaces of multithreading and multiprocessing very similar, so it adds very little complexity. To be sure you aren't clogging up your server, it might be easiest just to use multiprocessing.
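A sketch of how this could look from a single launcher script, assuming the scraper is exposed as a function rather than as top-level code. `scrape_loop` and `drain_latest` are hypothetical stand-ins for app1.py and for what app2.py would do on each request; the `max_items` parameter is only there so the loop can be bounded when experimenting.

```python
import multiprocessing
import queue
import time


def scrape_loop(out_queue, max_items=None, interval=0.01):
    """Stand-in for app1.py: push results onto a queue, forever by default."""
    count = 0
    while max_items is None or count < max_items:
        count += 1
        out_queue.put(f"result {count}")
        time.sleep(interval)


def drain_latest(in_queue, current=None):
    """Stand-in for app2.py: empty the queue and keep the newest item."""
    while True:
        try:
            current = in_queue.get_nowait()
        except queue.Empty:
            return current


if __name__ == "__main__":
    results = multiprocessing.Queue()
    scraper = multiprocessing.Process(
        target=scrape_loop, args=(results,), daemon=True
    )
    scraper.start()
    time.sleep(0.2)        # the Flask app would be started here instead
    print(drain_latest(results))
    scraper.terminate()
    scraper.join()
```

Note how close this is to the `threading` API: `Process` takes the same `target` and `args` arguments as `Thread`, and the main difference is that data travels through a queue instead of shared memory.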
multithreading?
The downside of multiprocessing is that Python has to pickle data to pass it between your processes, as they can't share memory like threads can. Compared to multithreading this is slow; however, it's still pretty fast for reasonable amounts of data. Remember that "premature optimisation is the root of all evil": profile your code before and after optimising to decide whether it was worth it.
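To illustrate what pickling implies, the following sketch does by hand roughly what happens when an object is sent between processes: it is serialised to bytes on one side and rebuilt as a separate copy on the other. The `scraped` dictionary is made-up example data.

```python
import pickle

scraped = {"title": "example", "links": ["a", "b"]}  # made-up data

# Roughly what happens when an object crosses a process boundary:
payload = pickle.dumps(scraped)     # serialised to bytes by the sender
received = pickle.loads(payload)    # rebuilt as a new object by the receiver

received["links"].append("c")       # changing the copy...
print(scraped["links"])             # ...leaves the original untouched
print(len(payload), "bytes pickled")
```

The practical consequence is that a change made in one process is not visible in another unless the data is sent again, which is different from threads sharing one object.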
asyncio
asyncio was added to Python with the aim of "making writing explicitly asynchronous, concurrent Python code easier and more Pythonic."; some people would disagree. I think you are best off trying it and seeing if it works for you. From the sounds of it, your application isn't large enough to really benefit from the massive concurrency that asyncio allows.
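For comparison, here is a small sketch of the same idea with asyncio, under the assumption that both the scraper and the consumer can be written as coroutines. `scrape_loop` and `serve` are hypothetical stand-ins (a standard Flask app would not slot in like this), and the loops are bounded so the example terminates.

```python
import asyncio


async def scrape_loop(state, iterations, interval=0.01):
    """Stand-in scraper: update shared state, yielding between updates."""
    for i in range(1, iterations + 1):
        state["latest"] = f"result {i}"
        await asyncio.sleep(interval)   # hands control to other tasks


async def serve(state, requests, interval=0.01):
    """Stand-in consumer: read the latest value a few times."""
    seen = []
    for _ in range(requests):
        await asyncio.sleep(interval)
        seen.append(state.get("latest"))
    return seen


async def main():
    state = {}
    # Both coroutines run concurrently on a single thread.
    _, seen = await asyncio.gather(
        scrape_loop(state, iterations=5),
        serve(state, requests=3),
    )
    return state, seen


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Everything runs in one thread, so no locks or pickling are needed, but every blocking call has to be replaced with an awaitable one for the tasks to interleave.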
Personally I would choose multiprocessing for this kind of thing.
It is generally not desirable for import my_script_which_loops to hang forever. Instead you will often see something like the following:
    # my_script_which_loops.py
    def main():
        while True:
            print("I am scraping the thing!")

    if __name__ == "__main__":
        main()
This means that if you run python my_script_which_loops.py then you will scrape the thing as intended; however, if the script isn't the main script, importing it won't hang. Please see here for more info.
Upvotes: 3