Brindavoine

Reputation: 33

Python - How to retrieve an element from JSON

Aloha,

My Python routine retrieves a JSON file from a site, inspects it, downloads a second JSON file based on the first response, and eventually downloads a zip. The first JSON file gives information about the documents. Here's an example:

[
    {
        "id": "d9789918772f935b2d686f523d066a7b",
        "originalName": "130010259_AC2_R44_20200101",
        "type": "SUP",
        "status": "document.deleted",
        "legalStatus": "APPROVED",
        "name": "130010259_SUP_R44_AC2",
        "grid": {
            "name": "R44",
            "title": "GRAND EST"
        },
        "bbox": [
            3.4212881,
            47.6171589,
            8.1598899,
            50.1338684
        ],
        "documentSource": "UPLOAD",
        "uploadDate": "2020-06-25T14:56:27+02:00",
        "updateDate": "2021-01-19T14:33:35+01:00",
        "fileIdentifier": "SUP-AC2-R44-130010259-20200101",
        "legalControlStatus": 101
    },
    {
        "id": "6a9013bdde6acfa632861aeb1a02942b",
        "originalName": "130010259_AC2_R44_20210101",
        "type": "SUP",
        "status": "document.production",
        "legalStatus": "APPROVED",
        "name": "130010259_SUP_R44_AC2",
        "grid": {
            "name": "R44",
            "title": "GRAND EST"
        },
        "bbox": [
            3.4212881,
            47.6171589,
            8.1598899,
            50.1338684
        ],
        "documentSource": "UPLOAD",
        "uploadDate": "2021-01-18T16:37:01+01:00",
        "updateDate": "2021-01-19T14:33:29+01:00",
        "fileIdentifier": "SUP-AC2-R44-130010259-20210101",
        "legalControlStatus": 101
    },
    {
        "id": "efd51feaf35b12248966cb82f603e403",
        "originalName": "130010259_PM2_R44_20210101",
        "type": "SUP",
        "status": "document.production",
        "legalStatus": "APPROVED",
        "name": "130010259_SUP_R44_PM2",
        "grid": {
            "name": "R44",
            "title": "GRAND EST"
        },
        "bbox": [
            3.6535762,
            47.665021,
            7.9509455,
            49.907347
        ],
        "documentSource": "UPLOAD",
        "uploadDate": "2021-01-28T09:52:31+01:00",
        "updateDate": "2021-01-28T18:53:34+01:00",
        "fileIdentifier": "SUP-PM2-R44-130010259-20210101",
        "legalControlStatus": 101
    },
    {
        "id": "2e1b6104fdc09c84077d54fd9e74a7a7",
        "originalName": "444619258_I4_R44_20210211",
        "type": "SUP",
        "status": "document.pre_production",
        "legalStatus": "APPROVED",
        "name": "444619258_SUP_R44_I4",
        "grid": {
            "name": "R44",
            "title": "GRAND EST"
        },
        "bbox": [
            2.8698336,
            47.3373246,
            8.0881368,
            50.3796449
        ],
        "documentSource": "UPLOAD",
        "uploadDate": "2021-04-19T10:20:20+02:00",
        "updateDate": "2021-04-19T14:46:21+02:00",
        "fileIdentifier": "SUP-I4-R44-444619258-20210211",
        "legalControlStatus": 100
    }
]

What I'm trying to do is retrieve the "id" values from this JSON file (e.g. "id": "2e1b6104fdc09c84077d54fd9e74a7a7").

I've tried

import json
from jsonpath_rw import jsonpath, parse
import jsonpath_rw_ext as jp

with open('C:/temp/gpu/SUP/20210419/SUPGE.json') as f:
    d = json.load(f)
    data = json.dumps(d)
    print("oriName: {}".format( jp.match1("$.id[*]",data) ) )
    

It doesn't work. In fact, I'm not sure how jsonpath-rw is intended to work. Thankfully there was this blogpost, but I'm still stuck.

Does anyone have a clue?

With the id, I'll be able to download another JSON file, which contains an archiveUrl pointing to the zip file.

Thanks in advance.

Upvotes: 0

Views: 449

Answers (2)

Brindavoine

Reputation: 33

Ok.

Here's what I've done.


import json
import urllib.request

# Not sure this is the best way to load JSON from a URL, but it works fine
# and I could test most of the code if needed.
def getResponse(url):
    operUrl = urllib.request.urlopen(url)
    if operUrl.getcode() != 200:
        print("Error received", operUrl.getcode())
        return None  # nothing to parse
    return json.loads(operUrl.read())

# Here I get the JSON from the URL.
# In the final script this URL will be a parameter,
# because I have a lot of territories to check.
d = getResponse('https://www.geoportail-urbanisme.gouv.fr/api/document?documentFamily=SUP&grid=R44&legalStatus=APPROVED') or []
for i in d:
    if i['status'] == 'document.production':
        print('id of the document in production:', i.get('id'))
        # Here we use the id to fetch the whole document.
        # Same server, same API, but a different URL.
        _URL = 'https://www.geoportail-urbanisme.gouv.fr/api/document/' + i.get('id') + '/details'
        d2 = getResponse(_URL)
        if d2 is None:
            continue
        print('archive', d2['archiveUrl'])
        urllib.request.urlretrieve(d2['archiveUrl'], 'c:/temp/gpu/SUP/' + d2['metadata'] + '.zip')

# I used wget in the past and loved the progress bar.
# Maybe I'll switch to wget because of it.
# Works fine.

Thanks for your answer. I'm delighted to see that even with only the json library you can do amazing things. Just normal stuff, but amazing. Feel free to comment if you think I've missed something.
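
About the progress bar mentioned in the code comments: urllib.request.urlretrieve accepts a reporthook callback, so something similar may be possible without wget. Below is a minimal sketch, not a tested part of the script above. The names percent_done, show_progress and download are made up for illustration, and the output is a plain percentage rather than a real bar.

```python
import urllib.request

def percent_done(block_num, block_size, total_size):
    """Return download progress as an integer percentage, or None if the size is unknown."""
    if total_size <= 0:
        # The server did not report a size, so no percentage can be computed.
        return None
    return min(100, block_num * block_size * 100 // total_size)

def show_progress(block_num, block_size, total_size):
    """reporthook for urlretrieve: rewrite one console line with the percentage."""
    pct = percent_done(block_num, block_size, total_size)
    if pct is not None:
        print("\rdownloaded {}%".format(pct), end="", flush=True)

def download(url, target):
    """Download url to target, reporting progress through show_progress (hypothetical helper)."""
    urllib.request.urlretrieve(url, target, reporthook=show_progress)
```

In the loop above, the urlretrieve call could then be replaced by download(d2['archiveUrl'], ...) with the same target path.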

Upvotes: 0

Iftekhar Chowdhury

Reputation: 11

import json

# Open the file and parse it into a Python list of dicts
with open('SUPGE.json') as f:
    d = json.load(f)

# Print the "id" of each document
for i in d:
    print(i.get('id'))

This will print only the ids:

d9789918772f935b2d686f523d066a7b
6a9013bdde6acfa632861aeb1a02942b
efd51feaf35b12248966cb82f603e403
2e1b6104fdc09c84077d54fd9e74a7a7
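
If only some of the documents are needed (for example the ones in production, as in the question's follow-up step), the same loop can be turned into a filtered list. This is a minimal sketch: the sample string is a trimmed copy of the JSON from the question, and the helper name ids_with_status is made up for illustration.

```python
import json

# Trimmed copy of the JSON from the question (only the fields used here).
sample = """
[
    {"id": "d9789918772f935b2d686f523d066a7b", "status": "document.deleted"},
    {"id": "6a9013bdde6acfa632861aeb1a02942b", "status": "document.production"},
    {"id": "efd51feaf35b12248966cb82f603e403", "status": "document.production"},
    {"id": "2e1b6104fdc09c84077d54fd9e74a7a7", "status": "document.pre_production"}
]
"""

def ids_with_status(documents, status):
    """Return the 'id' of every document whose 'status' equals the given status."""
    return [doc["id"] for doc in documents if doc.get("status") == status]

documents = json.loads(sample)  # a plain Python list of dicts
production_ids = ids_with_status(documents, "document.production")
print(production_ids)
```

Note that json.load / json.loads already return ordinary lists and dicts, so no path library is needed for this kind of lookup.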

Upvotes: 1
