Reputation: 138
Using BeautifulSoup4, I can isolate:
<a href="#" data-nutrition='{
"serving-name":"Milk, 2%",
"serving-size":"16 FL OZ",
"calories":"267"}'>
Milk, 2%
<i class="icon-leaf icon-hidden-text">Meatless</i>
</a>
By running:
for i in soup('a', attrs={'data-nutrition': True}):
    sample = i
    break
print(sample)
I need to create the dictionary:
my_dict = {
'serving-name': 'Milk, 2%',
'serving-size': '16 FL OZ',
'calories': '267'
}
How can I do this using BeautifulSoup4 in Python?
Upvotes: 1
Views: 74
Reputation: 474161
Locate the element and use json.loads() to load the data-nutrition attribute value into a Python dictionary:
import json
from bs4 import BeautifulSoup
data = """
<a href="#" data-nutrition='{
"serving-name":"Milk, 2%",
"serving-size":"16 FL OZ",
"calories":"267"}'>
Milk, 2%
<i class="icon-leaf icon-hidden-text">Meatless</i>
</a>"""
soup = BeautifulSoup(data, "html.parser")
a = soup.select_one("a[data-nutrition]")
nutrition = json.loads(a["data-nutrition"])
print(nutrition)
Prints:
{'serving-name': 'Milk, 2%', 'serving-size': '16 FL OZ', 'calories': '267'}
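If the page has several such links, the same idea can be applied to each of them. Below is a minimal sketch, not a definitive implementation: the helper name nutrition_dicts and the sample markup are hypothetical, and it assumes every data-nutrition value is meant to be JSON, skipping any that fail to parse.

```python
import json
from bs4 import BeautifulSoup


def nutrition_dicts(html):
    """Return one dict per <a> tag that carries a data-nutrition attribute."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for a in soup.select("a[data-nutrition]"):
        try:
            results.append(json.loads(a["data-nutrition"]))
        except json.JSONDecodeError:
            # Assumption: values that are not valid JSON are simply skipped.
            continue
    return results


# Hypothetical sample markup, not taken from the real page.
sample_html = """
<a href="#" data-nutrition='{"serving-name":"Milk, 2%","calories":"267"}'>Milk, 2%</a>
<a href="#" data-nutrition='{"serving-name":"Apple","calories":"95"}'>Apple</a>
<a href="#" data-nutrition='not json'>Broken</a>
<a href="#">No data</a>
"""

menu = nutrition_dicts(sample_html)
print(menu)
```

Links without the attribute are never selected, and the link whose attribute is not valid JSON is dropped by the except branch, so menu holds two dictionaries here.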
Upvotes: 1