dataviews

Reputation: 3100

BeautifulSoup: filter tags by the value of an attribute using has_attr

I'd like to print every item in the list except those whose style attribute has the following value: "text-align: center"

test = soup.find_all("p")
for x in test:
    if not x.has_attr('style'):
        print(x)

Essentially, I want all items in the list whose style is not equal to "text-align: center". The code above only skips tags that have any style attribute at all. It's probably a small error, but is it possible to specify the value of style in has_attr?

Upvotes: 1

Views: 439

Answers (2)

QHarr

Reputation: 84465

If you want to consider a different approach, you could use the CSS :not() pseudo-class together with an attribute selector:

from bs4 import BeautifulSoup as bs

html = '''
<html>
<head>
<title>Try jsoup</title>
</head>
<body>
<p style="color:green">This is the chosen paragraph.</p>
<p style="text-align: center">This is another paragraph.</p>
</body>
</html>

'''
soup = bs(html, 'lxml')
items = [item.text for item in soup.select('p:not([style="text-align: center"])')]
print(items)
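One thing worth knowing about this selector: `[style="..."]` compares against the whole attribute value, so a paragraph whose style contains additional declarations would not be excluded. The sketch below illustrates the difference using the substring operator `*=`. It assumes a bs4 version that uses soupsieve for `select`, and it swaps in the built-in `html.parser` and made-up sample HTML so that it runs without lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, chosen to show the edge case.
html = """
<p style="color:green">first</p>
<p style="text-align: center">second</p>
<p style="text-align: center; color: red">third</p>
<p>fourth</p>
"""
soup = BeautifulSoup(html, "html.parser")

# Exact match: only excludes tags whose style is exactly "text-align: center".
exact = [p.text for p in soup.select('p:not([style="text-align: center"])')]

# Substring match: excludes any tag whose style contains that text.
substring = [p.text for p in soup.select('p:not([style*="text-align: center"])')]

print(exact)      # ['first', 'third', 'fourth']
print(substring)  # ['first', 'fourth']
```

Both forms are still sensitive to spacing, so `text-align:center` (no space) would not match either selector.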

Upvotes: 1

Bitto

Reputation: 8225

has_attr only tells you whether the attribute exists; it cannot test its value. Instead, check whether the specific style is present in the tag's style attribute. BeautifulSoup does not treat style as a multi-valued attribute, so the entire quoted string is the attribute's value. Using x.get("style", '') instead of x['style'] also handles tags that have no style attribute and avoids a KeyError.

for x in test:
    if 'text-align: center' not in x.get("style",''):
        print(x)

You can also use a list comprehension to do the same thing in one line.

test = [x for x in soup.find_all("p") if 'text-align: center' not in x.get("style", '')]
print(test)
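Because the check above is a plain substring test, it depends on the exact spacing in the source HTML. If that is a concern, one possible variation is to split the style string into declarations before comparing. This is only a sketch: `style_declarations` is a hypothetical helper (not part of BeautifulSoup), the sample HTML is made up, and the parsing is deliberately naive (it does not handle semicolons inside quoted values or `url(...)`).

```python
from bs4 import BeautifulSoup

def style_declarations(tag):
    """Hypothetical helper: parse an inline style string into a dict."""
    declarations = {}
    # tag.get returns '' when the tag has no style attribute.
    for part in tag.get("style", "").split(";"):
        if ":" in part:
            prop, value = part.split(":", 1)
            declarations[prop.strip().lower()] = value.strip().lower()
    return declarations

# Hypothetical sample markup with inconsistent spacing.
html = """
<p style="color:green">first</p>
<p style="text-align: center">second</p>
<p style="text-align:center;color:red">third</p>
<p>fourth</p>
"""
soup = BeautifulSoup(html, "html.parser")

# Keep paragraphs that are not centered, regardless of spacing.
kept = [p.text for p in soup.find_all("p")
        if style_declarations(p).get("text-align") != "center"]
print(kept)  # ['first', 'fourth']
```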

Upvotes: 2
