gmarmstrong

Reputation: 138

Storing data from a tag in Python with BeautifulSoup4

Using BeautifulSoup4, I can isolate:

<a href="#" data-nutrition="{
    &quot;serving-name&quot;:&quot;Milk, 2%&quot;,
    &quot;serving-size&quot;:&quot;16 FL OZ&quot;,
    &quot;calories&quot;:&quot;267&quot;}">
Milk, 2%
<i class="icon-leaf icon-hidden-text">Meatless</i>
</a>

By running:

for i in soup('a', attrs={'data-nutrition' : True}):
    sample = i
    break
print(sample)

I need to create the dictionary:

my_dict = {
    'serving-name': 'Milk, 2%',
    'serving-size': '16 FL OZ',
    'calories': '267'
}

How can I do this using BeautifulSoup4 in Python?

Upvotes: 1

Views: 74

Answers (1)

alecxe

Reputation: 474161

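Locate the element, then pass its data-nutrition attribute value to json.loads() to get a Python dictionary. BeautifulSoup decodes the &quot; entities when it parses the attribute, so the value is already valid JSON: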

import json
from bs4 import BeautifulSoup


data = """
<a href="#" data-nutrition="{
    &quot;serving-name&quot;:&quot;Milk, 2%&quot;,
    &quot;serving-size&quot;:&quot;16 FL OZ&quot;,
    &quot;calories&quot;:&quot;267&quot;}">
Milk, 2%
<i class="icon-leaf icon-hidden-text">Meatless</i>
</a>"""
soup = BeautifulSoup(data, "html.parser")

a = soup.select_one("a[data-nutrition]")
nutrition = json.loads(a["data-nutrition"])
print(nutrition)

Prints:

{'serving-name': 'Milk, 2%', 'serving-size': '16 FL OZ', 'calories': '267'}
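
If the page has several such links, the same idea should extend to all of them. The following is only a sketch: the helper name extract_nutrition and the sample markup are made up for illustration, and it assumes that tags whose attribute is not valid JSON can simply be skipped.

```python
import json
from bs4 import BeautifulSoup

# Hypothetical sample markup: one valid tag, one with a malformed
# attribute value, and one without the attribute at all.
html = """
<a href="#" data-nutrition="{&quot;serving-name&quot;:&quot;Milk, 2%&quot;,&quot;calories&quot;:&quot;267&quot;}">Milk, 2%</a>
<a href="#" data-nutrition="not json">Broken</a>
<a href="#">No attribute</a>
"""


def extract_nutrition(markup):
    """Return one dict per <a> tag whose data-nutrition value parses as JSON."""
    soup = BeautifulSoup(markup, "html.parser")
    results = []
    # The CSS selector matches only <a> tags that carry the attribute.
    for a in soup.select("a[data-nutrition]"):
        try:
            results.append(json.loads(a["data-nutrition"]))
        except json.JSONDecodeError:
            # Assumption: silently skip values that are not valid JSON.
            continue
    return results


items = extract_nutrition(html)
print(items)
```

Note that the values stay strings (for example '267'), so convert them with int() or float() yourself if you need numbers.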

Upvotes: 1
