Reputation: 27
I want to find and print a list of the links in a page whose text contains the word "love".
Example page:
<a href="http://example/foto-fujifilm/">i like love with you</a>
<a href="http://example/foto-fujifilm/">i don't like love</a>
<a href="http://example/foto-fujifilm/">love is my problem</a>
<a href="http://example/foto-fujifilm/">i don't now</a>
This is my code:
from bs4 import BeautifulSoup
import requests
url = raw_input("Enter a website to extract the URL's from: ")
r = requests.get("http://" +url)
data = r.text
soup = BeautifulSoup(data,'lxml')
for a in soup.find_all('a', string="*love*"):
    print "Found the URL:", a['href']
How can I use a wildcard string to search for "love" in the link text?
Upvotes: 1
Views: 1493
Reputation: 46513
A plain string passed as string= is matched exactly, so "*love*" is not treated as a wildcard. Beautiful Soup also accepts regular expressions ...
import re
for a in soup.find_all('a', string=re.compile('love')):
    print('Found the URL:', a['href'])
and functions.
# s can be None for links that have no single text string (e.g. an image link)
for a in soup.find_all('a', string=lambda s: s is not None and 'love' in s):
    print('Found the URL:', a['href'])
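If you would rather keep writing a shell-style wildcard such as "*love*", one option is to translate it into a regular expression with the standard library's fnmatch.translate and use the compiled result. This is only a sketch: the helper name wildcard_to_regex and the texts list are made up for illustration and are not part of Beautiful Soup.

```python
import fnmatch
import re


def wildcard_to_regex(pattern, ignore_case=False):
    """Compile a shell-style wildcard (e.g. '*love*') into a regex object.

    Hypothetical helper for illustration; not a Beautiful Soup function.
    """
    flags = re.IGNORECASE if ignore_case else 0
    return re.compile(fnmatch.translate(pattern), flags)


# Link texts copied from the example page in the question.
texts = [
    "i like love with you",
    "i don't like love",
    "love is my problem",
    "i don't now",
]

love = wildcard_to_regex("*love*")

# Keep only the texts that the translated wildcard matches.
matching = [t for t in texts if love.match(t)]
print(matching)
```

The compiled pattern should also be usable directly as the filter, e.g. soup.find_all('a', string=wildcard_to_regex('*love*')), since string= accepts compiled regular expressions.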
EDIT:
For case-insensitive searches:
re.compile('love', re.IGNORECASE)
and
lambda s: s is not None and 'love' in s.lower()
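To see how the variants differ, here is a small self-contained sketch that runs them against inline markup instead of a live page. The HTML, the URLs, and the hrefs helper are invented for the example, and it uses the built-in "html.parser" so that lxml is not required. The last link contains only an image, so it has no text string; depending on the Beautiful Soup version, a filter function may be called with None in that case, which is why the lambda guards against it.

```python
import re

from bs4 import BeautifulSoup

# Made-up sample markup, modelled on the page in the question.
html = """
<a href="http://example/1/">i like love with you</a>
<a href="http://example/2/">I LOVE photos</a>
<a href="http://example/3/">i don't now</a>
<a href="http://example/4/"><img src="love.png"></a>
"""

soup = BeautifulSoup(html, "html.parser")


def hrefs(string_filter):
    """Return the href of every <a> whose text passes the given filter."""
    return [a["href"] for a in soup.find_all("a", string=string_filter)]


# Regular expression, case-sensitive: skips "I LOVE photos".
case_sensitive = hrefs(re.compile("love"))

# Regular expression, case-insensitive.
case_insensitive = hrefs(re.compile("love", re.IGNORECASE))

# Function filter, case-insensitive, guarded against a missing string.
with_function = hrefs(lambda s: s is not None and "love" in s.lower())

print(case_sensitive)
print(case_insensitive)
print(with_function)
```

Note that the image link is never returned, even though "love" appears in its src attribute: string= only looks at the link's text.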
Upvotes: 2