Reputation: 1414
I have managed to pull out most of the attributes of a site I am scraping, but I have come up short trying to extract the value of an attribute on the div tag itself.
Specifically, assuming the following:
<div class="item" data-color="red" data-itemid="abc">Red Slippers</div>
I am after the value of the data-itemid attribute, i.e. abc.
Everything I have tried returns the text inside the div (i.e. Red Slippers), which is not what I am after.
I have tried the following, without luck:
item_id = soup.find('data-itemid')
Any ideas?
Upvotes: 3
Views: 111
Reputation: 402553
You can use find_all with an attribute filter to narrow your search, and then access that particular attribute on each matching tag with dict-like indexing.
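from bs4 import BeautifulSoup

# text holds the HTML of the page being scraped
soup = BeautifulSoup(text, 'html5lib')
# match every <div> whose class is "item"
items = soup.find_all('div', {'class': 'item'})
for item in items:
    # tag attributes are read like dictionary keys
    print(item['data-itemid'])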
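As a minimal, self-contained sketch of the same idea, here is the div from the question parsed and queried. Two things are my own assumptions rather than part of the snippet above: the HTML is inlined in a variable named html, and I use the built-in 'html.parser' so nothing beyond bs4 needs to be installed. It also shows why the original attempt came back empty: the first argument to find is a tag name, not an attribute name.

```python
from bs4 import BeautifulSoup

# Assumption: the markup from the question, inlined so the sketch runs on its own.
html = '<div class="item" data-color="red" data-itemid="abc">Red Slippers</div>'
soup = BeautifulSoup(html, 'html.parser')

# find() treats its first argument as a tag name, so this looks for a
# <data-itemid> element, finds none, and returns None.
not_found = soup.find('data-itemid')

# Locate the tag first, then read the attribute from it.
div = soup.find('div', {'class': 'item'})
item_id = div['data-itemid']      # raises KeyError if the attribute is absent
missing = div.get('data-size')    # .get() returns None instead of raising

print(item_id)
```

Indexing with square brackets is fine when every matching div is known to carry the attribute; .get() is the safer choice when some of them might not.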
If you wish to narrow your search further, add more attribute filters to the dict, like this:
{'class' : 'item', 'data-color' : 'red', ...} # and so on
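A rough sketch of filtering on several attributes at once follows. The three sample divs are made up for illustration (only the first comes from the question), and 'html.parser' is again my substitution to avoid an extra dependency. Passing True as a filter value matches any tag that has the attribute at all, whatever its value, which is a handy way to skip tags that lack data-itemid.

```python
from bs4 import BeautifulSoup

# Assumption: hypothetical sample markup; only the first div is from the question.
html = '''
<div class="item" data-color="red" data-itemid="abc">Red Slippers</div>
<div class="item" data-color="blue" data-itemid="def">Blue Sandals</div>
<div class="item" data-color="red">Red Boots</div>
'''
soup = BeautifulSoup(html, 'html.parser')

# Every key in the dict must match; True means "attribute is present".
red_items = soup.find_all(
    'div',
    {'class': 'item', 'data-color': 'red', 'data-itemid': True},
)
red_ids = [tag['data-itemid'] for tag in red_items]

# Any div carrying data-itemid, regardless of class or color.
all_ids = [tag['data-itemid'] for tag in soup.find_all('div', {'data-itemid': True})]

print(red_ids)
print(all_ids)
```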
Upvotes: 4