Reputation: 79
The purpose of my Python script is to look for a few different strings in the HTML of a few websites. If it finds one of the strings, it returns a True flag.
Code:
import operator
import requests
import threading

# search for any of these items
search_for = ['about me', 'home page', 'website', 'submit your link', 'add a link']

# threads
threads = []


def send_get_request(link, search_for):
    try:
        html = requests.get(link)
    except requests.exceptions.RequestException as e:
        return False, e
    text = html.text.lower()
    if any(operator.contains(text, keyword.lower()) for keyword in search_for):
        return (True, link)
    else:
        return (False, link)


def process_result(result):
    if True in result:
        with open("potentialLinks.txt", "a") as file:
            file.write('{}\n'.format(str(result)))
        print("Success: {}".format(str(result)))
    else:
        print("Failed: {}".format(str(result)))


def main():
    # open and loop the links
    with open("profiles.txt", "r") as links:
        for link in links:
            link = link.strip()
            results = send_get_request(link, search_for)
            process_result(results)


# entry point ...
if __name__ == '__main__':
    main()
What I'm having issues with is:
if any(operator.contains(text, keyword.lower()) for keyword in search_for):
When it finds a keyword in the HTML, is it possible to return which keyword caused the True flag to trigger?
I cannot think of the best way to do this; more than likely I am overthinking something small. Thank you for any help on the matter.
Upvotes: 1
Views: 123
Reputation: 16876
found = None
for keyword in ["apple", "cat"]:
    if keyword.lower() in "this is a cat and this is not":
        found = keyword
        break
If you want all the matched keywords instead of just the first, use a list comprehension:
[keyword for keyword in ["apple", "cat"] if keyword.lower() in "this is a cat and this is not an apple"]
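If only the first match is needed, the loop with break can probably also be written with the built-in next() and a default value. This is just a sketch; first_match is a hypothetical helper name, not something from the question's code:

```python
def first_match(text, keywords):
    """Return the first keyword found in text (case-insensitive), or None."""
    lowered = text.lower()
    # next() stops at the first keyword that matches; None is the fallback
    return next((kw for kw in keywords if kw.lower() in lowered), None)


print(first_match("this is a cat and this is not", ["apple", "cat"]))  # cat
print(first_match("nothing here", ["apple", "cat"]))  # None
```

A return value of None then plays the role of the False flag, and any other value tells you which keyword triggered it.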
Upvotes: 1
Reputation: 23089
import operator
search_for = ['cat', 'mouse']
text = "I want to kill my cat"
keywords = [kw for kw in search_for if operator.contains(text.lower(), kw.lower())]
print(keywords)
text = "I want to kill my cat because it ate my mouse"
keywords = [kw for kw in search_for if operator.contains(text.lower(), kw.lower())]
print(keywords)
Output:
['cat']
['cat', 'mouse']
You can check for a match by testing whether the output list is non-empty (its length is > 0).
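To tie this back to the question's script, one possible way is to return the matched keywords alongside the flag. The sketch below assumes the page text has already been fetched; find_keywords and check_page are hypothetical names, and example.com is only a placeholder link:

```python
def find_keywords(text, search_for):
    """Return every keyword from search_for that occurs in text (case-insensitive)."""
    lowered = text.lower()
    return [kw for kw in search_for if kw.lower() in lowered]


def check_page(link, html_text, search_for):
    """Return (flag, link, matches); flag is True when at least one keyword matched."""
    matches = find_keywords(html_text, search_for)
    return (bool(matches), link, matches)


search_for = ['about me', 'home page', 'website', 'submit your link', 'add a link']
result = check_page("http://example.com", "<h1>About Me</h1> visit my website", search_for)
print(result)  # (True, 'http://example.com', ['about me', 'website'])
```

Because the flag is still the first element of the tuple, a check like `True in result` in process_result should keep working, and the third element shows which keywords were found.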
Upvotes: 1