Reputation: 51
I'm learning Python at the moment. I managed to get a huge JSON file, and I want to extract all the links from it and download them.
import json
import urllib3
urllib3.disable_warnings()
url = 'https://www.reddit.com/r/EarthPorn/top/.json'
http = urllib3.PoolManager()
suffix = ['.jpg','.png','.gif','.bmp']
while True:
    response = http.request('GET',url)
    myData = response.data
    parsedJson = json.loads(myData)
    finalUrl = parsedjson[0]['data']['children'][0]['data']['url']
    print(finalUrl)
At the moment I get an error on the finalUrl line, so I think I'm going wrong when trying to get each URL out of the JSON file.
source: https://www.reddit.com/r/earthporn/top/.json
Upvotes: 1
Views: 3035
Reputation: 2003
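Actually, you are not iterating through all the children in the response; the [0] indices only ever ask for a single element, and the top level of the response is indexed by the key 'data', not by 0. So you need to change the code to: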
import json
import urllib3
urllib3.disable_warnings()
url = 'https://www.reddit.com/r/EarthPorn/top/.json'
http = urllib3.PoolManager()
suffix = ['.jpg','.png','.gif','.bmp']
response = http.request('GET', url)
myData = response.data
parsedJson = json.loads(myData)
# The posts are in parsedJson['data']['children']; visit each one in turn
for child in parsedJson['data']['children']:
    finalUrl = child['data']['url']
    print(finalUrl)
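The suffix list is defined but never used. If the goal is to keep only direct image links, it can serve as a filter. This is a minimal sketch, assuming the listing has the data → children → data → url shape used above; the helper name extract_image_urls and the sample dictionary are made up for illustration:

```python
def extract_image_urls(listing, suffixes=('.jpg', '.png', '.gif', '.bmp')):
    """Return the 'url' of every child post whose URL ends in one of suffixes."""
    urls = []
    for child in listing['data']['children']:
        post_url = child['data'].get('url', '')
        # str.endswith accepts a tuple, so one call checks every suffix
        if post_url.lower().endswith(tuple(suffixes)):
            urls.append(post_url)
    return urls


# Hypothetical sample shaped like the Reddit listing (not real response data)
sample = {'data': {'children': [
    {'data': {'url': 'https://i.redd.it/szj6wnw2foi11.jpg'}},
    {'data': {'url': 'https://www.reddit.com/r/EarthPorn/comments/abc/'}},
    {'data': {'url': 'https://i.redd.it/kjwv5p15qni11.png'}},
]}}
image_urls = extract_image_urls(sample)
print(image_urls)
```

With the real response you would pass parsedJson instead of sample.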
Upvotes: 0
Reputation: 2079
Why don't you try using a loop to go through all the links?
for i in parsedJson['data']['children']:
    finalUrl = i['data']['url']
    print(finalUrl)
Output:
https://i.sstatic.net/LHAf3.jpg
https://i.redd.it/szj6wnw2foi11.jpg
https://i.redd.it/5k8vgy173mi11.jpg
https://i.sstatic.net/UaMch.jpg
https://i.redd.it/9nab5nvi4mi11.jpg
https://i.redd.it/9zgnp3z1gmi11.jpg
https://i.redd.it/ulhtdcomsoi11.jpg
https://i.redd.it/yjthueewmmi11.jpg
https://i.redd.it/gtdm76o3yni11.jpg
https://i.redd.it/1j7ez5alloi11.jpg
https://i.imgur.com/8xNGW6T.jpg
https://i.redd.it/13fk1b3rhki11.jpg
https://i.sstatic.net/jCRXi.jpg
https://i.redd.it/qqfb57u53ni11.jpg
https://i.redd.it/17fs1whd3pi11.jpg
https://i.redd.it/kjwv5p15qni11.png
https://i.redd.it/oayns08fjqi11.jpg
https://i.sstatic.net/10mMp.jpg
https://i.redd.it/px53p4e2ski11.jpg
https://i.redd.it/ncjytopnami11.jpg
https://i.sstatic.net/K6njE.jpg
https://i.redd.it/ecbs9yao5ni11.jpg
https://i.redd.it/10210k2rpli11.jpg
https://i.redd.it/xxs7h8ng1qi11.jpg
https://i.redd.it/5toz9ercjni11.jpg
Hope this is what you are looking for.
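For the download part of the question, one possible approach is to fetch each URL and write the bytes to disk. This is only a sketch: the names filename_from_url and download_all and the folder name 'images' are assumptions for illustration, and http is expected to be the urllib3.PoolManager from the question.

```python
import os
from urllib.parse import urlparse


def filename_from_url(url):
    """Use the last path segment of the URL as the local file name."""
    return os.path.basename(urlparse(url).path)


def download_all(urls, http, folder='images'):
    """Fetch each URL with http.request('GET', ...) and save the body in folder.

    http is assumed to behave like the urllib3.PoolManager from the question.
    """
    os.makedirs(folder, exist_ok=True)
    saved = []
    for url in urls:
        response = http.request('GET', url)
        if response.status != 200:
            continue  # skip links that did not download cleanly
        path = os.path.join(folder, filename_from_url(url))
        with open(path, 'wb') as f:
            f.write(response.data)  # response.data holds the raw body bytes
        saved.append(path)
    return saved
```

To use it, collect the URLs into a list inside the loop instead of printing them, then call download_all(urls, http).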
Upvotes: 1
Reputation: 85
Check whether the 0s you use are dictionary keys or list indices. If the second [0] turns out to be a key rather than an index, you may have to use '0'.
In your code:
finalUrl = parsedjson[0]['data']['children'][0]['data']['url']
Suggestion:
finalUrl = parsedjson[0]['data']['children']['0']['data']['url']
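One way to run that check is to print the type at each level: json.loads turns JSON objects into dicts (string keys) and JSON arrays into lists (integer indices). The snippet below uses a made-up miniature of the listing, so treat it as a sketch rather than the real response:

```python
import json

# Hypothetical miniature of the listing: an object whose 'children' is an array
raw = '{"data": {"children": [{"data": {"url": "https://i.redd.it/szj6wnw2foi11.jpg"}}]}}'
parsed = json.loads(raw)

# dict means "index with a string key"; list means "index with an integer"
top_type = type(parsed).__name__
children_type = type(parsed['data']['children']).__name__
print(top_type, children_type)

# In this sample, children is a list, so the integer 0 is the right index
first_url = parsed['data']['children'][0]['data']['url']
print(first_url)
```

If the real response has the same shape, the top level is a dict, so a leading [0] fails there, while children is a list and takes the integer 0 rather than '0'.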
Upvotes: 0