Reputation: 6554
I'm trying to get JSON from a Google Trends URL, but I can't parse it because the response content comes back as a bytes object (b'...') with extra characters in front of the JSON. How can I get this result as JSON?
My simple code:
import requests
r = requests.get('https://trends.google.ru/trends/api/stories/latest?hl=ru&tz=-180&cat=all&fi=15&fs=15&geo=RU&ri=300&rs=15&sort=0')
print(r.content)
r.content
starts with:
b')]}\'\n{"featuredStoryIds":[],"trendingStoryIds":["RU_lnk_iJ8H1AAwAACP-M_ru","RU_lnk_7H7L0wAwAAAnHM_ru","RU_lnk_Q-IB1AAwAABChM_ru","RU_lnk_EErj0wAwAADzKM_ru","RU_lnk_VY2s0wAwAAD57M_ru","RU_lnk_sdUP1AAwAAC-sM_ru","RU_lnk_ILv60wAwAADa2M_ru","RU_lnk_O6j70wAwAADAyM_ru","RU_lnk_fVQS1AAwAABvMM_ru","RU_lnk_TJ8D1AAwAABP-M_ru","RU_lnk_I97F0wAwAADmvM_ru","RU_lnk_tCrq0wAwAABeSM_ru","RU_lnk_W8EA1AAwAABbpM_ru","RU_lnk_IYX90wAwAADc5M_ru","RU_lnk_bz4M1AAwAABjWM_ru","RU_lnk_EJ-...
Decoding this with the r.json()
method fails:
simplejson.scanner.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Upvotes: 0
Views: 4148
Reputation: 272
Maybe try this; it might help:
import requests
r = requests.get('https://trends.google.ru/trends/api/stories/latest?hl=ru&tz=-180&cat=all&fi=15&fs=15&geo=RU&ri=300&rs=15&sort=0')
# Check the HTTP status code to confirm the request itself succeeded
page = r.status_code
print(page)
Upvotes: -2
Reputation: 1122392
You are contacting a Google service, and Google is prefixing JSON with some extra data to prevent JSON hijacking:
>>> import requests
>>> r = requests.get('https://trends.google.ru/trends/api/stories/latest?hl=ru&tz=-180&cat=all&fi=15&fs=15&geo=RU&ri=300&rs=15&sort=0')
>>> r.content[:10]
b')]}\'\n{"fea'
Note the )]}'
and newline characters at the start.
You need to remove this extra data first and manually decode; there are no other newlines in the payload so we can just split on the newline:
import json
json_body = r.text.splitlines()[-1]
json_data = json.loads(json_body)
I used Response.text
here to get decoded string data (the server sets the correct content type encoding in the headers).
This gives you a decoded dictionary:
>>> json_body = r.text.splitlines()[-1]
>>> json_data = json.loads(json_body)
>>> type(json_data)
<class 'dict'>
>>> sorted(json_data)
['date', 'featuredStoryIds', 'hideAllImages', 'storySummaries', 'trendingStoryIds']
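If you would rather not rely on the payload containing no other line breaks (str.splitlines() also splits on characters such as U+2028, which may legally appear inside JSON strings), you could strip the prefix explicitly instead. This is only a minimal sketch: parse_prefixed_json and XSSI_PREFIX are names I made up for illustration, and sample is a shortened stand-in for r.text, not the real response.

```python
import json

# The guard prefix seen at the start of the response above (hypothetical constant name).
XSSI_PREFIX = ")]}'"


def parse_prefixed_json(text):
    """Parse JSON text that may start with the )]}' guard line."""
    if text.startswith(XSSI_PREFIX):
        # Drop the prefix itself, then the newline/whitespace that follows it.
        text = text[len(XSSI_PREFIX):].lstrip()
    return json.loads(text)


# Shortened stand-in for r.text; the real payload is much longer.
sample = ')]}\'\n{"featuredStoryIds": [], "trendingStoryIds": ["RU_lnk_iJ8H1AAwAACP-M_ru"]}'
data = parse_prefixed_json(sample)
print(sorted(data))
```

With a live response you would call parse_prefixed_json(r.text). Text without the prefix is passed to json.loads unchanged, so the helper keeps working if the prefix is ever absent.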
Upvotes: 3