Sohan Das

Reputation: 1620

json.loads() does not load data properly

I'm trying to extract an embedded JSON object from a webpage and pass it to json.loads(). The first URL works, but loading the second URL raises an error:

ValueError: Unterminated string starting at: line 1 column 2078 (char 2077)

Here is the code:

import requests,json
from bs4 import BeautifulSoup

urls = ['https://www.autotrader.co.uk/dealers/greater-manchester/manchester/williams-landrover-9994',
'https://www.autotrader.co.uk/dealers/warwickshire/stratford-upon-avon/guy-salmon-land-rover-stratford-upon-avon-9965'
]

for url in urls:
    r = requests.get(url)
    soup = BeautifulSoup(r.content,'lxml')
    scripts = soup.find_all('script')[0]
    data = scripts.text.split("window['AT_APOLLO_STATE'] = ")[1].split(';')[0]
    jdata = json.loads(data)
    print(jdata)

Upvotes: 0

Views: 98

Answers (2)

QHarr

Reputation: 84465

The reason has already been given. You could also use a regex to extract the appropriate string:

import re, requests, json

urls = ['https://www.autotrader.co.uk/dealers/greater-manchester/manchester/williams-landrover-9994',
'https://www.autotrader.co.uk/dealers/warwickshire/stratford-upon-avon/guy-salmon-land-rover-stratford-upon-avon-9965'
]

p = re.compile(r"window\['AT_APOLLO_STATE'\] =(.*?});", re.DOTALL)
for url in urls:
    r = requests.get(url)
    jdata = json.loads(p.findall(r.text)[0])
    print(jdata)

Note: I missed a } in the original version of this post.
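As an alternative to a regex, here is a minimal sketch that lets the JSON parser itself decide where the object ends, using json.JSONDecoder.raw_decode from the standard library. The helper name extract_state and the sample string are hypothetical stand-ins, not the real page content, and the sketch assumes the JSON object starts immediately after the marker text.

```python
import json

MARKER = "window['AT_APOLLO_STATE'] = "


def extract_state(script_text):
    """Parse the JSON value that follows MARKER, ignoring what comes after it."""
    start = script_text.index(MARKER) + len(MARKER)
    # raw_decode parses one JSON value beginning at `start` and returns
    # (value, end_index), so any JavaScript after the object is ignored.
    obj, _end = json.JSONDecoder().raw_decode(script_text, start)
    return obj


# Hypothetical, shortened stand-in for the script text on the page.
sample = (
    "window['AT_APOLLO_STATE'] = "
    '{"strapline":"knowledgeable and enthusiastic; going the extra mile.",'
    '"n":{"id":1}};'
    " var other = 1;"
)

state = extract_state(sample)
print(state["strapline"])
```

Because the parser tracks string and brace boundaries itself, a ; or } inside a string value does not matter here, whereas a lazy pattern ending in }); could stop early if that sequence ever appeared inside the data.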

Upvotes: 1

caot

Reputation: 3328

If you print scripts.text.split("window['AT_APOLLO_STATE'] = ")[1], you will see the following excerpt, which contains a ; right after "and enthusiastic". Because of that, scripts.text.split("window['AT_APOLLO_STATE'] = ")[1].split(';')[0] cuts the data off at "and enthusiastic", in the middle of a string value, so the result is not valid JSON.

"strapline":"In our state-of-the-art dealerships across the U.K, Sytner Group represents the world’s most prestigious car manufacturers. All of our staff are knowledgeable and enthusiastic; making every interaction special by going the extra mile.",
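The truncation can be reproduced without fetching the page. The following is a small sketch that uses a shortened, hypothetical stand-in for the script text (the real payload is much longer) to show what split(';')[0] returns and why json.loads() rejects it.

```python
import json

# Hypothetical, shortened stand-in for the script text on the page.
script_text = (
    "window['AT_APOLLO_STATE'] = "
    '{"strapline":"knowledgeable and enthusiastic; making every interaction special."};'
)

payload = script_text.split("window['AT_APOLLO_STATE'] = ")[1]

# Splitting on the first ';' cuts the string value in half.
truncated = payload.split(';')[0]
print(truncated)

error_message = None
try:
    json.loads(truncated)
except ValueError as exc:  # json.JSONDecodeError is a subclass of ValueError
    error_message = str(exc)
    print(error_message)
```

The printed fragment ends at "enthusiastic" with no closing quote, which is why the parser reports an unterminated string.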

Upvotes: 2
