Hunter Beach

Reputation: 23

How to parse empty JSON property/element in Python

I am attempting to parse some JSON that I am receiving from a RESTful API, but I am having trouble accessing the data in Python because it appears that there is an empty property name.

A sample of the JSON returned:

{
  "extractorData" : {
    "url" : "RetreivedDataURL",
    "resourceId" : "e38e1a7dd8f23dffbc77baf2d14ee500",
    "data" : [ {
      "group" : [ {
        "CaseNumber" : [ {
          "text" : "PO-1994-1350",
          "href" : "http://www.referenceURL.net"
        } ],
        "DateFiled" : [ {
          "text" : "03/11/1994"
        } ],
        "CaseDescription" : [ {
          "text" : "Mary v. JONES"
        } ],
        "FoundParty" : [ {
          "text" : "Lastname, MARY BETH (Plaintiff)"
        } ]
      }, {
        "CaseNumber" : [ {
          "text" : "NP-1998-2194",
          "href" : "http://www.referenceURL.net"
        }, {
          "text" : "FD-1998-2310",
          "href" : "http://www.referenceURL.net"
        } ],
        "DateFiled" : [ {
          "text" : "08/13/1993"
        }, {
          "text" : "06/02/1998"
        } ],
        "CaseDescription" : [ {
          "text" : "IN RE: NOTARY PUBLIC VS REDACTED"
        }, {
          "text" : "REDACTED"
        } ],
        "FoundParty" : [ {
          "text" : "Lastname, MARY H (Plaintiff)"
        }, {
          "text" : "Lastname, MARY BETH (Defendant)"
        } ]
      } ]
    } ]
  }
}

And the Python code I am attempting to use:

import requests
import json

FirstName = raw_input("Please Enter First name: ")
LastName = raw_input("Please Enter Last Name: ")


with requests.Session() as c:
    url = ('https://www.requestURL.net/?name={}&lastname={}').format(LastName, FirstName)
    page = c.get(url)
    data = page.content

theJSON = json.loads(data)

def myprint(d):
    stack = d.items()
    while stack:
        k, v = stack.pop()
        if isinstance(v, dict):
            stack.extend(v.iteritems())
        else:
            print("%s: %s" % (k, v))

print myprint(theJSON["extractorData"]["data"]["group"])

I get the error:

TypeError: list indices must be integers, not str

I am new to parsing JSON in Python, and to anything beyond simple Python in general, so please excuse my ignorance. What leads me to believe that there is an empty property is that when I view the JSON in an online visual tool, I get empty brackets, like so:

(screenshot of the JSON viewer)

Any help parsing this data into text would be of great help.

EDIT: Now I am able to reference a certain node with this code:

for d in group:
    print group[0]['CaseNumber'][0]["text"]

But how can I now iterate over all the dictionaries in the group property and list every node labeled "CaseNumber"? It should exist in every one of them, e.g.

print group[0]['CaseNumber'][0]["text"]

then

for d in group:
    print group[1]['CaseNumber'][0]["text"]

and so on and so forth. Perhaps incrementing some sort of integer until it reaches the end? I am not quite sure.

Upvotes: 1

Views: 213

Answers (1)

Rajesh Yogeshwar

Reputation: 2179

If you look at the JSON carefully, the value under the data key is actually a list, but ["data"]["group"] indexes it as if it were a dictionary, which raises the TypeError.
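As a minimal sketch of why the lookup fails, the snippet below builds a small dictionary shaped like the response in the question. The name `sample` and the trimmed-down values are assumptions for illustration only; the exact wording of the error message differs slightly between Python versions.

```python
# Hypothetical, trimmed-down stand-in for the parsed API response.
sample = {
    "extractorData": {
        "data": [
            {"group": [{"CaseNumber": [{"text": "PO-1994-1350"}]}]}
        ]
    }
}

data = sample["extractorData"]["data"]
print(type(data).__name__)  # data is a list, not a dict

try:
    data["group"]  # indexing a list with a string key
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Indexing the list by position first reaches the dictionary that holds "group".
group = data[0]["group"]
print(group[0]["CaseNumber"][0]["text"])
```

Indexing with `[0]` is enough when data holds a single item; looping over data covers the general case.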

Reduced to its bare structure, your JSON looks something like this:

{
    "extractorData": {
        "url": "string",
        "resourceId": "string",
        "data": [{
            "group": []
        }]
    }
}

So if you want to access group, you should first retrieve data, which is a list:

data = sample['extractorData']['data']

Then you can iterate over data and get the group within each item:

for d in data:
    group = d['group']
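To go one step further and list every CaseNumber, the same idea can be nested: data is a list, each group is a list, and each CaseNumber is again a list of dictionaries. The following is only a sketch under that assumption; `sample` is a hypothetical, trimmed-down copy of the response and `case_numbers` is a helper name made up for this example.

```python
# Hypothetical stand-in for the parsed response, trimmed to the CaseNumber fields.
sample = {
    "extractorData": {
        "data": [
            {
                "group": [
                    {"CaseNumber": [{"text": "PO-1994-1350"}]},
                    {"CaseNumber": [{"text": "NP-1998-2194"},
                                    {"text": "FD-1998-2310"}]},
                ]
            }
        ]
    }
}


def case_numbers(parsed):
    """Collect the "text" of every CaseNumber entry in every group."""
    found = []
    for d in parsed["extractorData"]["data"]:  # data is a list of dicts
        for entry in d["group"]:               # group is a list of dicts
            # .get() with a default skips entries that lack a CaseNumber key
            for case in entry.get("CaseNumber", []):
                found.append(case["text"])
    return found


numbers = case_numbers(sample)
for number in numbers:
    print(number)
```

Looping over the items directly avoids keeping a manual index counter, and it keeps working when an entry holds more than one case number.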

I hope this clarifies things a bit for you.

Upvotes: 2
