Reputation: 93
So basically I have this function that I run to check whether a website exists or not. The problem is that it takes too much time: I have 50 different links to check one by one. Is there a faster way to do it? I'm currently using Python 3. I've heard of the threading module, but even after reading about it I'm still unsure how to use it. Do you guys have any good resources I could read or look at to understand it a little better?
Here's my code:
import requests

def checkExistingWeb():
    for i in range(1, 51):
        checkIfWebExist = requests.get("https://www.WEBSITE.com/ABCDE?id=" + str(i), allow_redirects=False)
        if checkIfWebExist.status_code == 200:
            print("\033[92mWeb Exists!\033[0m " + str(i))
        else:
            print("\033[91mWeb Does Not Exist!\033[0m " + str(i))

if __name__ == "__main__":
    checkExistingWeb()
Thanks!
Upvotes: 0
Views: 821
Reputation: 26
Yes, this is a good use case for multiple threads. Make the function check a single id, then let the executor run one call per id:
import concurrent.futures
import requests

def checkExistingWeb(i):
    # Check one id per call so each check can run in its own thread
    checkIfWebExist = requests.get("https://www.WEBSITE.com/ABCDE?id=" + str(i), allow_redirects=False)
    if checkIfWebExist.status_code == 200:
        print("\033[92mWeb Exists!\033[0m " + str(i))
    else:
        print("\033[91mWeb Does Not Exist!\033[0m " + str(i))

if __name__ == "__main__":
    with concurrent.futures.ThreadPoolExecutor() as executor:
        results = executor.map(checkExistingWeb, range(1, 51))
Upvotes: 0
Reputation: 4127
I really liked @Ronen's answer, so I looked into that excellent ThreadPoolExecutor class. His answer was incomplete, though: the tasks haven't necessarily finished at the end of his snippet, so you still need to handle those Futures. Here is an expanded answer:
from concurrent.futures import ThreadPoolExecutor
import time

executor = ThreadPoolExecutor()
checkweb_futures = [executor.submit(checkExistWeb, i) for i in range(1, 51)]  # pass the id parameter

# Poll until every task has finished
while not all(f.done() for f in checkweb_futures):
    time.sleep(1)

checkweb_results = [f.result() for f in checkweb_futures]
Now you can check the results of all the calls.
Upvotes: 0
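If you'd rather not poll with time.sleep, concurrent.futures also provides as_completed, which yields each future as soon as it finishes. Here is a minimal sketch of that pattern; check_exist_web is a stand-in for the real requests.get check so the example runs anywhere:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def check_exist_web(i):
    # Stand-in for the real requests.get check (hypothetical: even ids "exist")
    return i, i % 2 == 0

with ThreadPoolExecutor() as executor:
    futures = [executor.submit(check_exist_web, i) for i in range(1, 51)]
    # as_completed yields futures in completion order -- no sleep loop needed
    results = dict(f.result() for f in as_completed(futures))
```

The with block also waits for all outstanding tasks before the executor shuts down, so no manual done() check is needed.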
Reputation: 150
You can use multiprocessing.Pool
https://docs.python.org/3/library/multiprocessing.html
import requests, multiprocessing

def Check_Link(link):
    checkIfWebExist = requests.get(link, allow_redirects=False)
    if checkIfWebExist.status_code == 200:
        print("\033[92mWeb Exists!\033[0m " + link)
    else:
        print("\033[91mWeb Does Not Exist!\033[0m " + link)

if __name__ == "__main__":
    p = multiprocessing.Pool(5)
    p.map(Check_Link, ["https://www.WEBSITE.com/ABCDE?id=" + str(i) for i in range(1, 51)])
This code will create a pool of 5 worker processes and run the Check_Link function on each URL.
Upvotes: 2
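Since this workload is I/O-bound (waiting on HTTP responses), threads are usually enough; multiprocessing also ships a thread-based pool with the exact same map API, multiprocessing.pool.ThreadPool. A sketch, using a stand-in for the requests.get call so it runs without network access:

```python
from multiprocessing.pool import ThreadPool

def check_link(link):
    # Stand-in for the requests.get check; returns the id from the URL
    return link.rsplit("=", 1)[-1]

pool = ThreadPool(5)  # 5 worker threads instead of 5 processes
ids = pool.map(check_link, ["https://www.WEBSITE.com/ABCDE?id=" + str(i) for i in range(1, 51)])
pool.close()
pool.join()

print(ids[0], ids[-1])  # prints "1 50" -- map preserves input order
```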
Reputation: 319
Create the function so it performs one execution, and keep the loop outside the function:
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor()
checkweb_futures = [executor.submit(checkExistWeb, i) for i in range(1, 51)]
Upvotes: 0
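Putting the advice above together, a function that does one check with the loop outside can also be driven by executor.map instead of submit. A minimal sketch, with a stub standing in for the real requests.get check:

```python
from concurrent.futures import ThreadPoolExecutor

def check_exist_web(i):
    # One execution per call; swap this stub for the real requests.get check
    return i, i <= 25  # hypothetical: pretend the first 25 ids exist

with ThreadPoolExecutor(max_workers=10) as executor:
    # map feeds each id to the function and returns results in input order
    results = list(executor.map(check_exist_web, range(1, 51)))

for i, exists in results:
    print(i, "exists" if exists else "missing")
```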