Reputation: 1149
The following is an example of the HTML code I want to parse:
<html>
<body>
<td style="PADDING-LEFT: 5px"bgcolor="ffffff" class="style8"> Example BLAB BLAB BLAB </td>
<td style="PADDING-LEFT: 5px"bgcolor="ffffff" class="style8"> BLAB BLAB BLAB </td>
<td style="PADDING-LEFT: 5px"bgcolor="ffffff" class="style8"> BLAB BLAB BLAB </td>
<td style="PADDING-LEFT: 5px"bgcolor="ffffff" class="style8"> BLAB BLAB BLAB </td>
</body>
</html>
I am using Beautiful Soup to parse the HTML by selecting the style8 class as follows (where html holds the body of my HTTP response):
html = result.read()
soup = BeautifulSoup(html)
content = soup.select('.style8')
In this example, the content variable is a list of 4 Tags. For each item in the list, I want to check whether its text (the text of each style8 element) contains Example and, if it does, append that to a variable. If the loop goes through the entire list and Example does not occur anywhere in it, it should append Not present to the variable instead.
I have got the following so far:
foo = []
for i, tag in enumerate(content):
    if content[i].text == 'Example':
        foo.append('Example')
        break
    else:
        continue
This will only append Example to foo if it occurs; however, it will not append Not present if Example does not occur anywhere in the list.
Any method of doing this is appreciated, as is a better way of searching all the results to check whether a string is present.
Upvotes: 2
Views: 12486
Reputation: 4929
If you just want to check whether it was found or not, you could use a simple boolean flag as follows:
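foo = []
found = False
for tag in content:
    # Use a substring test: tag.text also holds the surrounding text and
    # whitespace, so an equality check against 'Example' would never match.
    if 'Example' in tag.text:
        found = True
        foo.append('Example')
        break
# No tag contained 'Example', so record that instead.
if not found:
    foo.append('Not present')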
If I understand what you want, this may be a simple approach, though alecxe's solution looks amazing.
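As a variation on the flag, Python's for loop accepts an else clause that runs only when the loop ends without hitting break, so the extra variable can be dropped. The following is a minimal sketch under assumptions: the names check, texts and needle are made up for illustration, and plain strings stand in for the tag.text values so the snippet runs without any HTML.

```python
# Plain strings stand in for the tag.text values of the four <td> elements.
texts = [
    " Example BLAB BLAB BLAB ",
    " BLAB BLAB BLAB ",
    " BLAB BLAB BLAB ",
    " BLAB BLAB BLAB ",
]


def check(texts, needle="Example"):
    """Return [needle] if any text contains it, otherwise ['Not present']."""
    foo = []
    for text in texts:
        if needle in text:  # substring test, not equality
            foo.append(needle)
            break
    else:
        # Reached only when the loop finished without a break.
        foo.append("Not present")
    return foo


found = check(texts)
missing = check([" BLAB BLAB BLAB "])
print(found, missing)  # ['Example'] ['Not present']
```

With real parsed content, the same loop would iterate over the tags and test tag.text instead of a plain string.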
Upvotes: 1
Reputation: 474061
You can use find_all() to find all td elements with class='style8' and use a list comprehension to construct the foo list:
from bs4 import BeautifulSoup
html = """<html>
<body>
<td style="PADDING-LEFT: 5px"bgcolor="ffffff" class="style8"> Example BLAB BLAB BLAB </td>
<td style="PADDING-LEFT: 5px"bgcolor="ffffff" class="style8"> BLAB BLAB BLAB </td>
<td style="PADDING-LEFT: 5px"bgcolor="ffffff" class="style8"> BLAB BLAB BLAB </td>
<td style="PADDING-LEFT: 5px"bgcolor="ffffff" class="style8"> BLAB BLAB BLAB </td>
</body>
</html>"""
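# Naming the parser avoids bs4's "no parser was explicitly specified" warning.
soup = BeautifulSoup(html, 'html.parser')
# One entry per matching td: "Example" if its text contains the word.
foo = ["Example" if "Example" in node.text else "Not Present"
       for node in soup.find_all('td', {'class': 'style8'})]
print(foo)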
prints:
['Example', 'Not Present', 'Not Present', 'Not Present']
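Note that the list comprehension yields one entry per td. If a single entry for the whole result set is wanted, as the question seems to describe, the built-in any() can collapse the check into one expression. This is only a sketch: the texts list is a stand-in for the node.text values, and with Beautiful Soup the generator would instead read node.text for each node returned by soup.find_all('td', {'class': 'style8'}).

```python
# Stand-in for [node.text for node in soup.find_all('td', {'class': 'style8'})].
texts = [
    " Example BLAB BLAB BLAB ",
    " BLAB BLAB BLAB ",
    " BLAB BLAB BLAB ",
    " BLAB BLAB BLAB ",
]

# any() stops at the first text that contains the word.
present = any("Example" in text for text in texts)
foo = ["Example" if present else "Not present"]
print(foo)  # ['Example']
```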
Upvotes: 3