getunstuck

Reputation: 63

Get the span class name using BeautifulSoup

I am using BeautifulSoup to scrape a website. The retrieved resultset looks like this:

<td><span class="I_Want_This_Class_Name"></span><span class="other_name">Text Is Here</span></td>

From here, I want to retrieve the class name "I_Want_This_Class_Name". I can get the "Text Is Here" part no problem, but the class name itself is proving to be difficult.

Is there a way to do this with the BeautifulSoup result set, or do I need to convert it to a dictionary?

Thank you

Upvotes: 0

Views: 833

Answers (2)

Raisul

Reputation: 464

from bs4 import BeautifulSoup

doc = '''<td><span class="I_Want_This_Class_Name"></span><span class="other_name">Text Is Here</span></td>
'''
soup = BeautifulSoup(doc, 'html.parser')

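res = soup.find('td')
out = {}
# Loop over the <span> tags only; iterating over the <td> directly would also
# yield any text nodes between tags, and those have no has_attr() method.
for each in res.find_all('span'):
    if each.has_attr('class'):
        # 'class' is a multi-valued attribute, so bs4 returns a list of names.
        out[each['class'][0]] = each.text
print(out)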

The output will be:

{'I_Want_This_Class_Name': '', 'other_name': 'Text Is Here'}

Upvotes: 1

codyho

Reputation: 280

If you only need the class name for this one result, I would use the select method on your soup object and then read the class key from the matched tag:

foo_class = soup.select('td>span.I_Want_This_Class_Name')[0]['class'][0]

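Note that the select method returns a list, hence the [0] index before the key. The value stored under the class key is also a list, because an element can carry several classes, hence the final [0].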
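The selector above already contains the class name, so it only helps when you know the name in advance. If you don't, a minimal sketch along the following lines might work instead. It assumes the span you want is the first span inside the td, as in the sample markup from the question; the variable names first_span, class_names and first_class are just illustrative.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question.
doc = '<td><span class="I_Want_This_Class_Name"></span><span class="other_name">Text Is Here</span></td>'
soup = BeautifulSoup(doc, 'html.parser')

# select_one returns the first matching tag (or None) instead of a list.
# Assumption: the span of interest is the first <span> child of the <td>.
first_span = soup.select_one('td > span')

# .get avoids a KeyError when the tag has no class attribute at all.
class_names = first_span.get('class', []) if first_span is not None else []

# class_names is a list because one element can have several classes.
first_class = class_names[0] if class_names else None
print(first_class)
```

No conversion to a dictionary is needed here: a tag already supports dictionary-style access to its attributes, and tag.attrs gives you the underlying dict if you want all of them at once.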

Upvotes: 1
