Beautiful soup returns nothing

Question

This is the HTML code:

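<div xmlns="" style="box-sizing: border-box; width: 100%; margin: 0 0 10px 0; padding: 5px 10px; background: #fdc431; font-weight: bold; font-size: 14px; line-height: 20px; color: #fff;">42263 - Unencrypted Telnet Server</div>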

I am trying to print 42263 - Unencrypted Telnet Server using Beautiful Soup, but the output is an empty list, i.e. []

This is my Python code:

from bs4 import BeautifulSoup
import csv
import urllib.request as urllib2

with open(r"C:\Users\sourabhk076\Documents\CBS_1.html") as fp:
    soup = BeautifulSoup(fp.read(), 'html.parser')

divs = soup.find_all('div', attrs={'background':'#fdc431'})

print(divs)

Keyur Potdar · Accepted Answer

background is not an attribute of the div tag; it is a CSS property inside the value of the tag's style attribute. The attributes of the div tag are:

{'xmlns': '', 'style': 'box-sizing: border-box; width: 100%; margin: 0 0 10px 0; padding: 5px 10px; background: #fdc431; font-weight: bold; font-size: 14px; line-height: 20px; color: #fff;'}
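To check this yourself, you can inspect the tag's attrs dictionary. The snippet below is a minimal sketch; the html string is an assumption, rebuilt from the attribute dict quoted above rather than read from the original CBS_1.html file.

```python
from bs4 import BeautifulSoup

# Assumed markup, reconstructed from the attribute dict quoted above.
html = (
    '<div xmlns="" style="box-sizing: border-box; width: 100%; '
    'margin: 0 0 10px 0; padding: 5px 10px; background: #fdc431; '
    'font-weight: bold; font-size: 14px; line-height: 20px; color: #fff;">'
    '42263 - Unencrypted Telnet Server</div>'
)

soup = BeautifulSoup(html, 'html.parser')
div = soup.find('div')

# Tag.attrs is a plain dict of the tag's HTML attributes.
print(sorted(div.attrs))                      # ['style', 'xmlns']
print('background' in div.attrs)              # False: it is a CSS property, not an attribute
print('background: #fdc431' in div['style'])  # True: it lives inside the style value
```

Since 'background' is not a key in that dictionary, attrs={'background': '#fdc431'} can never match, which is why find_all returns [].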

So you'll either have to match the full style value exactly:

soup.find_all('div', attrs={'style': 'box-sizing: border-box; width: 100%; margin: 0 0 10px 0; padding: 5px 10px; background: #fdc431; font-weight: bold; font-size: 14px; line-height: 20px; color: #fff;'})

or you can use a lambda function to check whether background: #fdc431 appears in the style attribute value, like this:

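html = '<div xmlns="" style="box-sizing: border-box; width: 100%; margin: 0 0 10px 0; padding: 5px 10px; background: #fdc431; font-weight: bold; font-size: 14px; line-height: 20px; color: #fff;">42263 - Unencrypted Telnet Server</div>'
soup = BeautifulSoup(html, 'html.parser')
# t.get('style', '') avoids a KeyError on div tags that have no style attribute
print(soup.find(lambda t: t.name == 'div' and 'background: #fdc431' in t.get('style', '')).text)
# 42263 - Unencrypted Telnet Server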

Or you can use a regular expression, as shown by Jatimir in his answer.
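Jatimir's answer is not reproduced here, so the following is only a sketch of what a regular-expression approach might look like, not his exact code. It relies on find_all accepting a compiled pattern as an attribute filter; the html string is again an assumption rebuilt from the attributes above, with one extra unstyled div added for illustration.

```python
import re
from bs4 import BeautifulSoup

# Assumed markup: one unstyled div (hypothetical) plus the div from the question.
html = (
    '<div>some other block</div>'
    '<div xmlns="" style="box-sizing: border-box; width: 100%; '
    'margin: 0 0 10px 0; padding: 5px 10px; background: #fdc431; '
    'font-weight: bold; font-size: 14px; line-height: 20px; color: #fff;">'
    '42263 - Unencrypted Telnet Server</div>'
)

soup = BeautifulSoup(html, 'html.parser')

# \s* tolerates any amount of whitespace after the colon.
pattern = re.compile(r'background:\s*#fdc431')

# A compiled pattern is searched within the attribute value;
# tags without a style attribute simply do not match.
matches = soup.find_all('div', style=pattern)
titles = [d.get_text(strip=True) for d in matches]
print(titles)  # ['42263 - Unencrypted Telnet Server']
```

Compared with matching the full style string, this keeps working if the other CSS properties change or are reordered.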
