Stephano

Reputation: 223

How can I extract this JSON with BeautifulSoup?

JSON

 <script>
 var data2sales= 
 [{
   "key": "Owners",
   "bar": true,
   "values": [
     [1490400000000, 1591, "", "", ""],
     [1490486400000, 1924, "#2B6A94", "", ""],
     [1490572800000, 1982, "", "", ""],
     [1490659200000, 1606, "", "", ""]]
 }]
 </script>

My Python code to get the JSON

 notices = str(soup.select('script')[30])
 split_words=notices.split('var data2sales= ')
 split_words=split_words[1]
 temp=split_words[44:689]
 temp = 'var data2sales= {' +temp + '}'
 print(temp)
 newDict = json.loads((temp))
 print(newDict)

I'm new to BeautifulSoup in Python and I'm trying to extract a dict from a script tag. As you can see in my code, I rebuild the JSON string in Python and try to load it into the newDict variable, but it doesn't work. Can anyone show me how to extract that JSON? Thank you.

Upvotes: 4

Views: 10402

Answers (2)

Youth overturn

Reputation: 417

I just use len(eval(data.get_text())['data']['song']['list'])

Upvotes: 0

Sam Chats

Reputation: 2321

Assuming the script above is in a string named text, you can do something like the following:

import json
from bs4 import BeautifulSoup

soup = BeautifulSoup(text, 'html.parser')
script_text = soup.find('script').get_text()
relevant = script_text[script_text.index('=')+1:]  # drop the '=' and everything before it
data = json.loads(relevant)  # a list containing one dictionary
print(json.dumps(data, indent=4))

Output:

[
    {
        "key": "Owners",
        "bar": true,
        "values": [
            [
                1490400000000,
                1591,
                "",
                "",
                ""
            ],
            [
                1490486400000,
                1924,
                "#2B6A94",
                "",
                ""
            ],
            [
                1490572800000,
                1982,
                "",
                "",
                ""
            ],
            [
                1490659200000,
                1606,
                "",
                "",
                ""
            ]
        ]
    }
]
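
Slicing at the first '=' only works when the script body is exactly one assignment with nothing after the JSON. If the real page has a trailing semicolon or more JavaScript after the array, a slightly more defensive sketch is to locate the variable by name and let the json module stop at the end of the value. The helper name extract_js_json and the trailing semicolon in the sample are my own assumptions, not part of the question; with BeautifulSoup you would pass it the .string of each tag from soup.find_all('script') until one matches.

```python
import json
import re


def extract_js_json(script_text, var_name):
    """Return the JSON value assigned to `var_name` in a script body.

    Hypothetical helper: assumes the value is valid JSON written as
    `var <name> = <json>`.
    """
    # Find "var <name> =" and skip any whitespace after the equals sign.
    match = re.search(r'var\s+' + re.escape(var_name) + r'\s*=\s*', script_text)
    if match is None:
        raise ValueError('variable %r not found' % var_name)
    # raw_decode parses one JSON value starting at the given index and
    # ignores whatever follows it (a semicolon, more JavaScript, ...).
    value, _end = json.JSONDecoder().raw_decode(script_text, match.end())
    return value


# Shortened copy of the script body from the question, with an assumed
# trailing semicolon to show that extra text after the JSON is tolerated.
script_text = '''
 var data2sales=
 [{
   "key": "Owners",
   "bar": true,
   "values": [
     [1490400000000, 1591, "", "", ""],
     [1490486400000, 1924, "#2B6A94", "", ""]]
 }];
'''

data = extract_js_json(script_text, 'data2sales')
print(data[0]['key'])
```

This avoids hard-coded character offsets such as [44:689], which break as soon as the page content changes length.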

Upvotes: 7
