Reputation: 63
I am using BeautifulSoup to scrape a website. The retrieved resultset looks like this:
<td><span class="I_Want_This_Class_Name"></span><span class="other_name">Text Is Here</span></td>
From here, I want to retrieve the class name "I_Want_This_Class_Name". I can get the "Text Is Here" part no problem, but the class name itself is proving to be difficult.
Is there a way to do this directly on the BeautifulSoup result set, or do I need to convert it to a dictionary first?
Thank you
Upvotes: 0
Views: 833
Reputation: 464
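from bs4 import BeautifulSoup

doc = '''<td><span class="I_Want_This_Class_Name"></span><span class="other_name">Text Is Here</span></td>
'''
soup = BeautifulSoup(doc, 'html.parser')
res = soup.find('td')
out = {}
# Visit only the direct child tags of the td (text nodes have no has_attr)
for each in res.find_all(True, recursive=False):
    if each.has_attr('class'):
        # 'class' is a list of names; map the first one to the tag's text
        out[each['class'][0]] = each.text
print(out)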
The output will be:
{'I_Want_This_Class_Name': '', 'other_name': 'Text Is Here'}
Upvotes: 1
Reputation: 280
If you only need the class name for this one result, I would use the select method on your soup object and then read the 'class' key of the matched tag:
foo_class = soup.select('td>span.I_Want_This_Class_Name')[0]['class'][0]
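Note that select returns a list of matches, hence the [0] index before the ['class'] key. The class attribute is itself a list, because a tag can carry several class names, so the final [0] picks the first one.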
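The selector above already contains the class name, so it only helps if you know it beforehand. If you don't, a positional selector might work instead. Here is a minimal sketch, assuming the span you want is the first span inside the td; the variable names (first_span, classes, class_name) are just illustrative:

```python
from bs4 import BeautifulSoup

html = '<td><span class="I_Want_This_Class_Name"></span><span class="other_name">Text Is Here</span></td>'
soup = BeautifulSoup(html, 'html.parser')

# select_one returns the first matching tag (or None), so no list indexing is needed
first_span = soup.select_one('td > span')

# .get avoids a KeyError when the tag has no class attribute at all
classes = first_span.get('class', []) if first_span is not None else []

# The class attribute is a list of names; take the first one if present
class_name = classes[0] if classes else None
print(class_name)
```

If the span can carry more than one class, classes holds all of them in document order, so you may want the whole list rather than just the first entry.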
Upvotes: 1