K.S

Reputation: 113

Extracting first key by value in nested json

I have a JSON response from which I need to extract a key/value pair the very first time it appears. I went through several examples, but I keep getting an error.

{
    "search-results": {
        "entry": [
            {
                "@_fa": "true",
                "eid": "10-s2.0-60103485",
                "prism:url": "https://api.elsevier.com/content/affiliation/affiliation_id/60103485"
            },
            {
                "@_fa": "true",
                "eid": "10-s2.0-113310784",
                "prism:url": "https://api.elsevier.com/content/affiliation/affiliation_id/113310784"
            },

I tried:

myvar = results['search-results'][0]['entry']['eid']
print (myvar)

TypeError: string indices must be integers


results = json.dumps(resp.json(),
             sort_keys=True,
             indent=4, separators=(',', ': '))


print(results)
myvar = results['search-results'][0]['entry']['eid']
print (myvar)  

I need to extract "eid": "10-s2.0-60103485" the very first time it appears.

Upvotes: 0

Views: 1595

Answers (3)

bruno desthuilliers

Reputation: 77902

Your question is not quite clear on this point, but if this:

results = json.dumps(resp.json(),
             sort_keys=True,
             indent=4, separators=(',', ': '))


print(results)
myvar = results['search-results'][0]['entry']['eid']
print (myvar)  

is the code that raised the TypeError, then that is exactly what one would expect. json.dumps(obj, ...) takes a Python object and serializes it to a JSON string, so in your code results is a string (the fact that it contains JSON-formatted content is irrelevant). To deserialize a JSON string into a Python object that you can index, you want json.loads(somejsonstr) instead.
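As a minimal sketch of that difference, the snippet below round-trips a small dict through json.dumps and json.loads. The sample data is hypothetical and only mirrors the structure shown in the question.

```python
import json

# Hypothetical sample mirroring the structure in the question.
data = {"search-results": {"entry": [{"eid": "10-s2.0-60103485"}]}}

as_text = json.dumps(data, indent=4)   # serializes: dict -> str
print(type(as_text).__name__)          # the result is a plain string

try:
    as_text["search-results"]          # indexing a str with a str key fails
except TypeError as exc:
    print("TypeError:", exc)

round_trip = json.loads(as_text)       # deserializes: str -> dict
first_eid = round_trip["search-results"]["entry"][0]["eid"]
print(first_eid)
```

The lookup only works on the deserialized object, never on the string produced by json.dumps.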

Now, resp.json() (I assume resp comes from the python-requests library) already deserializes the response content to Python, so you have nothing more to do, except of course getting the dict keys and list indexes in the right order, as already mentioned by James_F:

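import requests

# someurl is a placeholder for the API endpoint being queried
resp = requests.get(someurl)
results = resp.json()  # already a Python dict, no json.dumps/json.loads needed
# 'entry' is a list inside 'search-results', so index the list after the key
myvar = results['search-results']['entry'][0]['eid']
print(myvar)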
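If the response might come back with no entries, a slightly more defensive variant may be useful. This is only a sketch: first_eid is a hypothetical helper name, and the sample dict stands in for the parsed response.

```python
def first_eid(results):
    """Return the eid of the first entry, or None if there is no entry."""
    # .get() with defaults avoids a KeyError when a level is missing.
    entries = results.get("search-results", {}).get("entry", [])
    if not entries:
        return None
    return entries[0].get("eid")


# Hypothetical stand-in for the dict returned by resp.json().
sample = {
    "search-results": {
        "entry": [
            {"@_fa": "true", "eid": "10-s2.0-60103485"},
            {"@_fa": "true", "eid": "10-s2.0-113310784"},
        ]
    }
}

print(first_eid(sample))
```

Returning None instead of raising is a design choice; raising an exception may be preferable if an empty result should be treated as an error.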

Upvotes: 1

Frans

Reputation: 837

Try using jsonpath_ng.ext.

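from jsonpath_ng.ext import parse

# data is the already-parsed response, e.g. data = resp.json()
# The filter keeps only the entries whose eid matches the given value.
f = parse("$..entry[?(@.eid=='10-s2.0-60103485')]").find(data)
print(f[0].value)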

This outputs {'@_fa': 'true', 'eid': '10-s2.0-60103485', 'prism:url': 'https://api.elsevier.com/content/affiliation/affiliation_id/60103485'}
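If adding a third-party package is not an option, a similar lookup can be sketched in plain Python. Note the assumptions: find_entry_by_eid is a hypothetical helper, and it only looks under the fixed path search-results then entry, whereas the $.. in the JSONPath expression searches at any depth.

```python
def find_entry_by_eid(data, eid):
    """Return the first entry whose 'eid' equals eid, or None if none match."""
    entries = data.get("search-results", {}).get("entry", [])
    # next() stops at the first match instead of scanning the whole list.
    return next((entry for entry in entries if entry.get("eid") == eid), None)


# Hypothetical stand-in for the parsed response.
data = {
    "search-results": {
        "entry": [
            {"@_fa": "true", "eid": "10-s2.0-60103485"},
            {"@_fa": "true", "eid": "10-s2.0-113310784"},
        ]
    }
}

match = find_entry_by_eid(data, "10-s2.0-113310784")
print(match)
```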

Upvotes: 0

James_F

Reputation: 449

You just need to flip the positions of ['entry'] and [0].

myvar = results['search-results']['entry'][0]['eid']
print(myvar)

Upvotes: 2
