ev0lution37

Reputation: 1179

Python - Return path of all JSON elements that match string

I'm trying to traverse a JSON object in Python and return the path of every string value that equals a given value. I'm traversing it recursively, but if multiple elements match the comparison, it only returns the first:

test_json = {
    "a": {
        "b": {
            "c": {
                "d": "foo"
            }
        }
    },
    "1": {
        "2": {
            "3": "bar"
        }
    },
    "a1" : "foo"
}

def searchDict(d, path):
    for k,v in d.iteritems():
        if isinstance(v, dict):
            path.append(k)
            return searchDict(v, path)
        else:
            if v == "foo":
                path.append(k)
                path.append(v)
                return path

print searchDict(test_json, [])

I want this to have the capacity to return something like:

a -> b -> c -> d -> foo
a1 -> foo

But instead it only iterates through the first sub-dictionary:

['a', 'b', 'c', 'd', 'foo']

This is probably easier than I'm making it, just having trouble logically solving it. Any ideas?

Upvotes: 3

Views: 1868

Answers (3)

baconStrips

Reputation: 111

I wanted to add my answer, inspired by Martin Gottweis's answer. Some logic was added to handle nested lists too:

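def searchDict(d, path=None, sPath=None):
    # Use None defaults: list defaults would be shared between separate calls.
    path = [] if path is None else path
    sPath = [] if sPath is None else sPath
    if isinstance(d, dict):
        for k, v in d.items():
            if v == "foo":
                sPath.append(path + [k])  # include the matching key itself
            searchDict(v, path + [k], sPath)
    elif isinstance(d, list):
        for i, obj in enumerate(d):
            if obj == "foo":
                sPath.append(path + [i])  # a match stored directly in a list
            searchDict(obj, path + [i], sPath)
    return sPath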
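
A side note on list defaults such as path=[] or sPath=[]: Python creates a default value once, when the function is defined, so a list default is shared by every call that does not pass its own list. The sketch below shows that behaviour with two hypothetical helper functions (collect and collect_safe are made-up names, not part of any answer here):

```python
def collect(item, bucket=[]):
    # `bucket` is one list object created at definition time,
    # so items from earlier calls are still in it.
    bucket.append(item)
    return bucket


def collect_safe(item, bucket=None):
    # A None default lets each call start with a fresh list.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = collect("a")
second = collect("b")
print(second)               # ['a', 'b']
print(collect_safe("a"))    # ['a']
print(collect_safe("b"))    # ['b']
```

For a path search this means a second search could return the matches of the first one as well, unless the result list is created inside the function or passed in explicitly.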

Upvotes: 1

Martin Gottweis

Reputation: 2739

What a good question. You are actually two tiny bits away from solving it yourself.

  1. Appending to the path variable means the same list is shared by all recursive calls. Using path + [k] solves this, because it creates a new list for each call. If that is hard to see, run your code and print path at the beginning of the searchDict function.
  2. You are correctly using a for loop to iterate over the JSON object. However, you are also returning inside the for loop, which stops it, so the remaining possibilities are never explored. Either print the results, add them to a result list, or use a Python generator to produce them.

Check out my modified working code. I tried to change as little of your code as possible so it's easy to follow.

test_json = {
    "a": {
        "b": {
            "c": {
                "d": "foo"
            }
        }
    },
    "1": {
        "2": {
            "3": "bar"
        }
    },
    "a1" : "foo"
}

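def searchDict(d, path):
    # items() works on both Python 2 and 3 (iteritems() is Python 2 only)
    for k,v in d.items():
        if isinstance(v, dict):
            # pass a new list down instead of mutating the shared one
            searchDict(v, path + [k])
        else:
            if v == "foo":
                # no return here, so the loop keeps looking for more matches
                print(path + [k] + [v])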


searchDict(test_json, [])
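
The generator option mentioned in point 2 could look roughly like the sketch below. It is only one possible shape, not the code from the question: find_paths and its target parameter are names made up for this example, and the inner loop is used instead of `yield from` so it also runs on Python 2.

```python
def find_paths(node, target, path=()):
    """Yield one list per match: the keys leading to it, then the value."""
    if isinstance(node, dict):
        for key, value in node.items():
            # Re-yield whatever the deeper levels find.
            for found in find_paths(value, target, path + (key,)):
                yield found
    elif node == target:
        yield list(path) + [node]


test_json = {
    "a": {"b": {"c": {"d": "foo"}}},
    "1": {"2": {"3": "bar"}},
    "a1": "foo",
}

for p in find_paths(test_json, "foo"):
    print(" -> ".join(p))
```

Because the paths are yielded rather than printed inside the function, the caller decides what to do with them: print them in the "a -> b -> c -> d -> foo" form from the question, collect them with list(), or stop after the first one.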

Upvotes: 3

twiden

Reputation: 19

A couple of observations, and maybe hints:

  1. You want your algorithm to print all the paths, but you only keep one list, so every path is added to the same data structure. It sounds like you want a list of lists, perhaps with an extra list variable that you pass along. Your choice.
  2. You break out of the loop in a slightly strange place: "return searchDict(v, path)". If there are more paths to explore, you will miss them here. If you remove the return keyword, the code will behave differently, and I think you can solve it from there.
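
One way the two hints could fit together is sketched below, as an illustration rather than the only solution. The function name search_dict and the extra target and results parameters are assumptions made for this example; results is the "extra list variable" that gets passed along.

```python
def search_dict(d, target, path, results):
    for k, v in d.items():
        if isinstance(v, dict):
            # No `return` here: after the recursive call finishes,
            # the loop carries on with the remaining keys.
            search_dict(v, target, path + [k], results)
        elif v == target:
            # Store a complete path as its own list (a list of lists overall).
            results.append(path + [k, v])
    return results


test_json = {
    "a": {"b": {"c": {"d": "foo"}}},
    "1": {"2": {"3": "bar"}},
    "a1": "foo",
}

matches = search_dict(test_json, "foo", [], [])
for match in matches:
    print(match)
```

Using path + [k] builds a new list for each branch, so sibling branches never see each other's keys, while results is deliberately the same list everywhere so that all matches end up in one place.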

Good luck!

Upvotes: 1
