Alan

Reputation: 17

How to handle missing data in JSON

I have a JSON file that looks like this:

[{
"query": "In China Cash flow from operating activities was $670 million, up 19%.ABB's regional and country order trends for the third quarter are illustrated on Slide 4.",
"topScoringIntent": {
    "intent": "Operating environment",
    "score": 0.7391448
},
"intents": [{
    "intent": "Operating environment",
    "score": 0.7391448
}, {
    "intent": "Value Proposition",
    "score": 0.0394879468
}, {
    "intent": "Competitive advantage",
    "score": 0.02228919
}, {
    "intent": "Operational efficiency",
    "score": 0.0117919622
}, {
    "intent": "Critical resources",
    "score": 0.00395535352
}, {
    "intent": "None",
    "score": 0.0015990251
}],
"entities": [{
    "entity": "in china",
    "type": "Regional Demand",
    "startIndex": 0,
    "endIndex": 7,
    "score": 0.9902908
}],
"compositeEntities": [{
    "parentType": "Regional Demand",
    "value": "in china",
    "children": []
    }]
 },{
 "query": "Specifically, in the United States, our largest market, Electrification order growth was robust apart from large orders, whichhad a tough comparison base.",
"topScoringIntent": {
    "intent": "Competitive advantage",
    "score": 0.252725929
},
"intents": [{
    "intent": "Competitive advantage",
    "score": 0.252725929
}, {
    "intent": "Operating environment",
    "score": 0.06572733
}, {
    "intent": "Operational efficiency",
    "score": 0.0437437519
}, {
    "intent": "Value Proposition",
    "score": 0.0294999164
}, {
    "intent": "Critical resources",
    "score": 0.00545410533
}, {
    "intent": "None",
    "score": 0.00353605044
}],
"entities": []
}]

As suggested, I have changed my code to use the "get" method, as shown below.

def getNested(dictionary, values):
    if len(values) == 1:
        return dictionary.get(values[0], "")
    elif len(values) > 1:
        cur = values.pop(0)
        item = dictionary.get(cur)
        if item:
            return getNested(item, values)

The code below works as expected and gives me the correct output.

import json
with open('responsefile1.json') as f:
    data = json.load(f)

for state in data:
    print(getNested(state, ["topScoringIntent", "intent"]), getNested(state, ["query"]), getNested(state, ["entities"]))

Output:

Operating environment In China Cash flow from operating activities was $670 million, up 19%.ABB's regional and country order trends for the third quarter are illustrated on Slide 4 [{'entity': 'in china', 'type': 'Regional Demand', 'startIndex': 0, 'endIndex': 7, 'score': 0.9902908}]
Competitive advantage Specifically, in the United States, our largest market, Electrification order growth was robust apart from large orders, whichhad a tough comparison base. []

Why doesn't the code below, which asks for ["entities", "type"] instead of ["entities"], give me the correct output?

import json
with open('C:/Users/Alankar.Gupta/python Scripts/responsefile1.json') as f:
    data = json.load(f)

for state in data:
    print(getNested(state, ["topScoringIntent", "intent"]),getNested(state, ["query"]),getNested(state,["entities","type"]))

Output:

<ipython-input-34-c06f0bb5b433> in getNested(dictionary, values)
  1 def getNested(dictionary, values):
  2     if len(values) == 1:
----> 3         return dictionary.get(values[0], "")
  4     elif len(values) > 1:
  5         cur = values.pop(0)

AttributeError: 'list' object has no attribute 'get'

Is there a problem with the list object? I am fairly new to Python and can't figure this out.

Upvotes: 1

Views: 1562

Answers (2)

Riddhesh Markandeya

Reputation: 589

Similar to dsvp9xyjsqmfvi8p's answer, but with the recursion replaced by a for loop and with support for list indices added.

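def getNested(dictionary, keys):
    # Follow each key (dict key or list index) in turn.
    for key in keys:
        try:
            dictionary = dictionary[key]
        except (KeyError, IndexError, TypeError):
            # Missing key, index out of range, or wrong container type.
            return None
    return dictionary

# For the first record in the question (one entity):
getNested(state, ["entities", 0, "type"])  # returns "Regional Demand"
getNested(state, ["entities", 1, "type"])  # returns None (no second entity)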
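For context, the AttributeError in the question arises because "entities" holds a list of dictionaries, and a list has no get method. Below is a minimal, self-contained sketch of the loop version applied to both cases. The `data` list is an abbreviated stand-in for the JSON in the question (queries and score fields are trimmed, which is my assumption for brevity), and the names `rows` and `string_key_on_list` are only illustrative.

```python
def getNested(dictionary, keys):
    """Follow keys through nested dicts and lists; return None if a step fails."""
    for key in keys:
        try:
            dictionary = dictionary[key]
        except (KeyError, IndexError, TypeError):
            return None
    return dictionary


# Abbreviated stand-in for the data in the question (trimmed for brevity).
data = [
    {
        "query": "In China ...",
        "topScoringIntent": {"intent": "Operating environment"},
        "entities": [{"entity": "in china", "type": "Regional Demand"}],
    },
    {
        "query": "Specifically, in the United States ...",
        "topScoringIntent": {"intent": "Competitive advantage"},
        "entities": [],  # no entities found for this query
    },
]

# Intent plus the type of the first entity (None when there is no entity).
rows = [
    (
        getNested(state, ["topScoringIntent", "intent"]),
        getNested(state, ["entities", 0, "type"]),
    )
    for state in data
]
for row in rows:
    print(row)
# ('Operating environment', 'Regional Demand')
# ('Competitive advantage', None)

# Indexing a list with a string raises TypeError, which is caught here.
string_key_on_list = getNested(data[0], ["entities", "type"])
print(string_key_on_list)  # None
```

The second record has an empty "entities" list, so index 0 raises IndexError and the function returns None instead of failing.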

Upvotes: 0

dsvp9xyjsqmfvi8p

Reputation: 152

To get an element from a nested dictionary, we could write a recursive function:

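def getNested(dictionary, values):
    # Base case: one key left, so look it up (default to "" if missing).
    if len(values) == 1:
        return dictionary.get(values[0], "")
    elif len(values) > 1:
        # Note: pop(0) removes the key from the list the caller passed in.
        cur = values.pop(0)
        item = dictionary.get(cur)
        if item:
            return getNested(item, values)
    # Falls through and returns None when a key is missing or its value is empty.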

and you could call it on your dictionary like:

getNested(state, ["topScoringIntent", "intent"])
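Note that this function assumes every intermediate value is a dictionary. With a path such as ["entities", "type"], the value under "entities" is a list, so calling get on it raises the AttributeError shown in the question. One possible variant, sketched below, applies the remaining keys to each element whenever it meets a list, and slices the key list instead of popping from it so the caller's list is left unchanged. The name `get_nested_all` and the two sample records are hypothetical, not part of the original code.

```python
def get_nested_all(obj, keys):
    """Follow keys through dicts; on a list, apply the remaining keys to each item."""
    if not keys:
        return obj
    if isinstance(obj, list):
        return [get_nested_all(item, keys) for item in obj]
    if isinstance(obj, dict):
        # keys[1:] makes a new list, so the caller's list is not modified.
        return get_nested_all(obj.get(keys[0]), keys[1:])
    # Anything else (e.g. None from a missing key) cannot be traversed further.
    return None


# Hypothetical, trimmed-down records shaped like those in the question.
with_entity = {
    "topScoringIntent": {"intent": "Operating environment"},
    "entities": [{"entity": "in china", "type": "Regional Demand"}],
}
without_entity = {
    "topScoringIntent": {"intent": "Competitive advantage"},
    "entities": [],
}

path = ["entities", "type"]
print(get_nested_all(with_entity, path))     # ['Regional Demand']
print(get_nested_all(without_entity, path))  # []
print(get_nested_all(without_entity, ["compositeEntities", "value"]))  # None
```

With this design an empty "entities" list yields an empty list, while a key that is absent altogether yields None, so the two kinds of missing data remain distinguishable.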

Upvotes: 1
