julia456124645

Reputation: 51

Extract all links from a JSON file

I'm learning Python at the moment. I managed to get a huge JSON file, and I want to extract all the links from it and download them.

import json
import urllib3
urllib3.disable_warnings()
url = 'https://www.reddit.com/r/EarthPorn/top/.json'
http = urllib3.PoolManager()
suffix = ['.jpg','.png','.gif','.bmp']
while True:
    response = http.request('GET',url)
    myData = response.data
    parsedJson = json.loads(myData)
    finalUrl = parsedjson[0]['data']['children'][0]['data']['url']
    print(finalUrl)

At the moment I get an error on the finalUrl line, so I think I'm making a mistake when trying to get each URL from the JSON file.

source: https://www.reddit.com/r/earthporn/top/.json

Upvotes: 1

Views: 3035

Answers (3)

Arun

Reputation: 2003

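Actually, you are not iterating over all the children in the response. Two more things: the code assigns parsedJson but then reads parsedjson, and Python names are case-sensitive, so that line fails with a NameError; and the parsed result is indexed here as a dict, without the leading [0]. So you need to change the code to:

import json
import urllib3

urllib3.disable_warnings()
url = 'https://www.reddit.com/r/EarthPorn/top/.json'
http = urllib3.PoolManager()
suffix = ['.jpg','.png','.gif','.bmp']
# Fetch once; the original while True loop would repeat the request forever
response = http.request('GET', url)
myData = response.data
parsedJson = json.loads(myData)
# Each child is one post; its link is stored under ['data']['url']
for children in parsedJson['data']['children']:
    finalUrl = children['data']['url']
    print(finalUrl)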
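
The suffix list in the code above is defined but never used. If the goal is to keep only direct image links, something like the following might work. This is only a sketch: extract_image_urls and the sample dict are made up for illustration, and it assumes the parsed JSON has the same data -> children -> data -> url layout that the loop above relies on.

```python
from urllib.parse import urlparse

# Same extensions as the question's suffix list, as a tuple for str.endswith
IMAGE_SUFFIXES = ('.jpg', '.png', '.gif', '.bmp')


def extract_image_urls(listing, suffixes=IMAGE_SUFFIXES):
    """Return every child URL whose path ends with one of the suffixes.

    Hypothetical helper; `listing` is the dict returned by json.loads.
    """
    urls = []
    for child in listing.get('data', {}).get('children', []):
        url = child.get('data', {}).get('url')
        # Compare only the path, so query strings and letter case are ignored
        if url and urlparse(url).path.lower().endswith(suffixes):
            urls.append(url)
    return urls


# Made-up sample with the assumed layout (not real Reddit data)
sample = {
    'data': {
        'children': [
            {'data': {'url': 'https://example.com/a.jpg'}},
            {'data': {'url': 'https://example.com/gallery/123'}},
            {'data': {'url': 'https://example.com/b.PNG?width=640'}},
        ]
    }
}

print(extract_image_urls(sample))
```

With the real response you would pass parsedJson instead of sample. Using .get with defaults means a missing key gives an empty result rather than a KeyError.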

Upvotes: 0

Prem

Reputation: 85

Check whether the 0s you use are dictionary keys or list indices. If the second [0] turns out to be a dictionary key, you may have to write it as the string '0'.

In your code:

finalUrl = parsedjson[0]['data']['children'][0]['data']['url']

Suggestion:

finalUrl = parsedjson[0]['data']['children']['0']['data']['url']
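
One way to run that check is to look at the type of each level before indexing into it. A minimal sketch, using a made-up JSON string rather than the real response (describe is a hypothetical helper, not part of the json module):

```python
import json

# Made-up sample; the real response may be shaped differently
raw = '{"data": {"children": [{"data": {"url": "https://example.com/a.jpg"}}]}}'
parsed = json.loads(raw)


def describe(value):
    """Say whether a parsed JSON value needs a key (dict) or an integer index (list)."""
    if isinstance(value, dict):
        return 'dict with keys ' + str(sorted(value))
    if isinstance(value, list):
        return 'list of length ' + str(len(value))
    return type(value).__name__


top = describe(parsed)
children = describe(parsed['data']['children'])
print(top)
print(children)
```

A dict takes a key such as 'data' or '0', while a list takes an integer such as 0. In this sample the top level is a dict and children is a list, so the integer index is the one that applies there.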

Upvotes: 0
