Reputation: 80346
Is there a way to get HTML tag attributes only for tags that contain text
(as with text=True), without specifying the tag names?
Example:
html = '<p class="c4">SOMETEXT</p>'
I could do:
[tag.attrs for tag in soup.findAll('p')]
>>> [[(u'class', u'c4')]]
Is there a way to do:
[text.attrs for text in soup.findAll(text=True)]
Help much appreciated!
Upvotes: 1
Views: 1274
Reputation: 142106
I think this is what you want, now that the question has been clarified:
[tag.attrs for tag in soup.findAll(True) if tag.string]
.findAll(True)
returns every tag in the document, so each one has an .attrs
(even if it's empty), and the if clause keeps only the tags that have .string
content.
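One caveat with filtering on .string: when a tag's only child is another tag, the outer tag reports the inner tag's .string, so wrappers get picked up too. If that matters, you can start from the text nodes and take each one's parent instead. A minimal sketch, assuming bs4 (4.4 or newer for the string= argument; older versions spell it text=) and the built-in html.parser; the div id="outer" wrapper is made up for illustration and is not from the question:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the <div id="outer"> wrapper is only here to show the caveat.
html = '<div id="outer"><p class="c4">SOMETEXT</p></div><p class="c5"></p>'
soup = BeautifulSoup(html, "html.parser")

# Start from the text nodes and take the tag that directly contains each one.
text_parent_attrs = [
    text.parent.attrs
    for text in soup.find_all(string=True)
    if text.strip()  # skip whitespace-only text nodes
]
print(text_parent_attrs)    # [{'class': ['c4']}]

# For comparison: filtering on tag.string also returns the wrapping <div>,
# because .string looks through a single child tag.
string_filter_attrs = [tag.attrs for tag in soup.find_all(True) if tag.string]
print(string_filter_attrs)  # [{'id': 'outer'}, {'class': ['c4']}]
```

Which one you want depends on whether "has text" means "directly contains text" or "resolves to a single string".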
Upvotes: 3
Reputation: 174614
>>> from bs4 import BeautifulSoup as bs
>>> html = '<p class="c4">SOMETEXT</p><p class="c5"></p>'
>>> soup = bs(html, 'html.parser')  # name the parser explicitly to avoid the bs4 warning
>>> [tag.attrs for tag in soup.findAll('p') if tag.string]
[{'class': ['c4']}]
Upvotes: 1