Reputation: 8205
According to the BeautifulSoup documentation, it is possible to get the value of a tag's attribute with code like this:
from bs4 import BeautifulSoup
soup = BeautifulSoup('<b class="boldest">Extremely bold</b>')
tag = soup.b
tag['class']
In theory (that is, according to the docs), the output should be:
u'boldest'
However, when I run the above code, it outputs:
['boldest']
So, is there something I'm missing? How can I get a tag's attribute value as a plain unicode string?
Upvotes: 0
Views: 254
Reputation: 2082
Check this section in the documentation:
Multi-valued attributes
HTML 4 defines a few attributes that can have multiple values. HTML 5 removes a couple of them, but defines a few more. The most common multi-valued attribute is class (that is, a tag can have more than one CSS class). Others include rel, rev, accept-charset, headers, and accesskey. Beautiful Soup presents the value(s) of a multi-valued attribute as a list:
tag['class'][0]
will give you the first class as a plain string.
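To make the difference concrete, here is a small sketch comparing a multi-valued attribute (class) with a single-valued one (id). The id attribute and the explicit 'html.parser' argument are my additions for illustration; they are not in the question's snippet.

```python
from bs4 import BeautifulSoup

# 'html.parser' is passed explicitly to avoid the "no parser specified" warning.
# The id attribute is added here only for comparison.
soup = BeautifulSoup('<b class="boldest" id="intro">Extremely bold</b>', 'html.parser')
tag = soup.b

class_value = tag['class']     # multi-valued attribute: a list of strings
first_class = tag['class'][0]  # first (here, only) class as a plain string
id_value = tag['id']           # single-valued attribute: a plain string

print(class_value, first_class, id_value)
```

So only attributes that HTML treats as multi-valued come back as lists; the rest are already plain strings.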
Upvotes: 1
Reputation: 241
tag['class'][0]
A tag can have more than one class, which is why a list of values is returned. If you are sure there is only one class, just take the first element of the list.
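If you are not sure how many classes there are, something along these lines may be safer than indexing. This is only a sketch: the markup and the variable names are made up for the example.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one tag with two classes, one tag with no class at all.
soup = BeautifulSoup('<b class="boldest big">Bold</b><i>Plain</i>', 'html.parser')

bold_classes = soup.b['class']            # list with one entry per class
bold_as_string = ' '.join(bold_classes)   # rebuild the original attribute text

# Tag.get() returns the given default instead of raising KeyError
# when the attribute is missing.
italic_classes = soup.i.get('class', [])
italic_first = italic_classes[0] if italic_classes else None

print(bold_as_string, italic_first)
```

Joining with a space gives back the whole attribute as one string, and the .get() call avoids a KeyError on tags that have no class.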
Upvotes: 1