Wiktor Lippa

Reputation: 11

What is the best way to find the index of an element (key or value) in a complex JSON file in Python?

I have a complex JSON file composed of multiple nested lists and dictionaries. Somewhere in this file there is an element "ABC": it can be a list element, a key, or a value. Which method could I use to search through the file and find the index of this element?

Example:

{"Record": 
    {"RecordType": "No",
    "RecordNumber": 11,
    "Section": [
      {
        "Heading": "Structure",
        "Description": "Compound",
        "Information": [
          {
            "ReferenceNumber": 88,
            "Name": "2D Structure",
            "BoolValue": true
          }
        ]
      }]}}

I would like to search for 2D Structure and have Python return: {"Record"}{"Section"}[0]{"Information"}[0]{"Name"}.

I tried searching for some kind of "reverse dictionary", where I could pass in a string and have its location returned, but didn't find anything that would work. Maybe there is a simple solution?

Upvotes: 0

Views: 81

Answers (1)

juanpa.arrivillaga

Reputation: 95948

I'll assume you've deserialized your JSON into a Python object:

import json
with open('path/to/my.json') as f:
    obj = json.load(f)

Now, the "simplest" solution is to brute-force search the entire nested structure. Here is a quick function I cooked up that does that. It's not very efficient, but it works:

def search_nested(obj, target, acc):
    """Return the path (list of keys/indexes) to the first match, or None."""
    if isinstance(obj, list):
        for i, e in enumerate(obj):
            if isinstance(e, (list, dict)):
                # Recurse into nested containers, extending the path so far.
                x = search_nested(e, target, acc + [i])
                if x is not None:
                    return x
            elif e == target:
                return acc + [i]
    elif isinstance(obj, dict):
        for k, v in obj.items():
            if target == v:
                return acc + [k]
            elif target == k:  # how do you want to handle this?
                return acc  # Maybe?
            elif isinstance(v, (list, dict)):
                x = search_nested(v, target, acc + [k])
                if x is not None:
                    return x
    return None  # no match in this branch

So, in the REPL:

In [3]: obj
Out[3]:
{'Record': {'RecordNumber': 11,
  'RecordType': 'No',
  'Section': [{'Description': 'Compound',
    'Heading': 'Structure',
    'Information': [{'BoolValue': True,
      'Name': '2D Structure',
      'ReferenceNumber': 88}]}]}}

In [4]: search_nested(obj, "2D Structure", [])
Out[4]: ['Record', 'Section', 0, 'Information', 0, 'Name']
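If you want every occurrence rather than just the first, or want the result printed in the bracket notation from the question, a generator is one way to do it. This is only a sketch, not part of the function above: iter_paths and format_path are names I made up, it assumes the target is a scalar (string, number, etc.) rather than a list or dict, and it treats a matching key as a hit whose path ends at that key. Note that Python considers True == 1, so searching for 1 would also match a JSON true.

```python
import json


def iter_paths(obj, target, path=()):
    """Yield a tuple of keys/indexes for every place target occurs."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            if key == target:
                # A matching key counts as a hit; the path ends at the key.
                yield path + (key,)
            # Keep searching inside the value either way.
            yield from iter_paths(value, target, path + (key,))
    elif isinstance(obj, list):
        for index, item in enumerate(obj):
            yield from iter_paths(item, target, path + (index,))
    elif obj == target:
        # Scalar leaf that equals the target.
        yield path


def format_path(path):
    """Render a path as {"key"}[index]... like in the question."""
    return "".join(
        f"[{step}]" if isinstance(step, int) else '{"%s"}' % step
        for step in path
    )


doc = json.loads(
    '{"Record": {"Section": [{"Information": [{"Name": "2D Structure"}]}]}}'
)
for path in iter_paths(doc, "2D Structure"):
    print(format_path(path))
# prints {"Record"}{"Section"}[0]{"Information"}[0]{"Name"}
```

Because it yields lazily, next(iter_paths(doc, target), None) gives you just the first match if that is all you need.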

Upvotes: 2
