Reputation: 1096
I want to find all div
tags which have a certain pattern in their class
name but my code is not working as desired.
This is the code snippet
soup = BeautifulSoup(html_doc, 'html.parser')
all_findings = soup.findAll('div',attrs={'class':re.compile(r'common text .*')})
where html_doc
is the string with the following html
<div class="common text sighting_4619012">
<div class="hide-c">
<div class="icon location"></div>
<p class="reason"></p>
<p class="small">These will not appear</p>
<span class="button secondary ">wait</span>
</div>
<div class="show-c">
</div>
</div>
But all_findings
is coming out as an empty list while it should have found one item.
It's working in the case of exact match
all_findings = soup.findAll('div',attrs={'class':re.compile(r'hide-c')})
I am using bs4
.
Upvotes: 1
Views: 597
Reputation: 474141
To extend @Andy's answer, you can make a list of class names and compiled regular expressions:
soup.find_all('div', {'class': ["common", "text", re.compile(r'sighting_\d{5}')]})
Note that, in this case, you'll get the div
elements with one of the specified classes/patterns - in other words, it's common
or text
or sighting_
followed by five digits.
If you want to have them joined with "and", one option would be to turn off the special treatment for "class" attributes by having the document parsed as "xml":
soup = BeautifulSoup(html_doc, 'xml')
all_findings = soup.find_all('div', class_=re.compile(r'common text sighting_\d{5}'))
print all_findings
Upvotes: 0
Reputation: 50620
Instead of using a regular expression, put the classes you are looking for in a list:
all_findings = soup.findAll('div',attrs={'class':['common', 'text']})
Example code:
from bs4 import BeautifulSoup
html_doc = """<div class="common text sighting_4619012">
<div class="hide-c">
<div class="icon location"></div>
<p class="reason"></p>
<p class="small">These will not appear</p>
<span class="button secondary ">wait</span>
</div>
<div class="show-c">
</div>
</div>"""
soup = BeautifulSoup(html_doc, 'html.parser')
all_findings = soup.findAll('div',attrs={'class':['common', 'text']})
print all_findings
This outputs:
[<div class="common text sighting_4619012">
<div class="hide-c">
<div class="icon location"></div>
<p class="reason"></p>
<p class="small">These will not appear</p>
<span class="button secondary ">wait</span>
</div>
<div class="show-c">
</div>
</div>]
Upvotes: 2