Reputation: 2732
Is there an expression-based tool for querying complex Python objects, the same way one can use XPath for XML or JSONPath for JSON?
I thought about serializing my object to JSON and then running jsonpath on it, but that seems like a clumsy way of doing it.
Upvotes: 4
Views: 1473
Reputation: 338
Adding a non-library option here: a function that finds a nested element based on a dot-notation string (it can traverse nested dicts and lists). See below or the Gist here:
from functools import reduce
import re
from typing import Any, Optional, Union


def find_key(dot_notation_path: str, payload: Union[dict, list]) -> Any:
    """Try to get a deep value from a dict or list based on a dot-notation path"""

    def get_despite_none(payload: Optional[Union[dict, list]], key: str) -> Any:
        """Try to get value from a dict or list, even if the container is None"""
        if not payload or not isinstance(payload, (dict, list)):
            return None
        # can also access lists if needed, e.g., if key is '[1]'
        if (num_key := re.match(r"^\[(\d+)\]$", key)) is not None:
            try:
                return payload[int(num_key.group(1))]
            except (IndexError, KeyError):
                # KeyError covers an index-style key applied to a dict
                return None
        # a plain key only makes sense for a dict; a list has no .get()
        if isinstance(payload, dict):
            return payload.get(key, None)
        return None

    found = reduce(get_despite_none, dot_notation_path.split("."), payload)
    # compare to None, as the key could exist and be empty
    if found is None:
        raise KeyError(dot_notation_path)
    return found
# Test cases:
payload = {
    "haystack1": {
        "haystack2": {
            "haystack3": None,
            "haystack4": "needle"
        }
    },
    "haystack5": [
        {"haystack6": None},
        {"haystack7": "needle"}
    ],
    "haystack8": {},
}

find_key("haystack1.haystack2.haystack4", payload)
# "needle"
find_key("haystack5.[1].haystack7", payload)
# "needle"
find_key("[0].haystack5.[1].haystack7", [payload, None])
# "needle"
find_key("haystack8", payload)
# {}
find_key("haystack1.haystack2.haystack4.haystack99", payload)
# KeyError
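Since the question is about arbitrary Python objects rather than only dicts and lists, the same dot-notation idea can be stretched to object attributes. What follows is only a rough sketch under assumptions: `get_path`, `Engine` and `Car` are hypothetical names made up for this example (they are not part of the function above), and unlike `find_key` it returns a default instead of raising, using a sentinel so that a stored `None` is not mistaken for a missing key.

```python
import re
from typing import Any

_MISSING = object()  # sentinel: lets a stored None be told apart from "not found"


def get_path(obj: Any, path: str, default: Any = None) -> Any:
    """Follow a dot-notation path through dicts, lists/tuples and attributes."""
    current = obj
    for part in path.split("."):
        index = re.fullmatch(r"\[(\d+)\]", part)
        if index is not None:
            # "[n]" addresses a position in a list or tuple
            try:
                current = current[int(index.group(1))]
            except (IndexError, KeyError, TypeError):
                return default
        elif isinstance(current, dict):
            current = current.get(part, _MISSING)
        else:
            # anything else is treated as a plain object attribute
            current = getattr(current, part, _MISSING)
        if current is _MISSING:
            return default
    return current


# Hypothetical classes, only here to have something to query
class Engine:
    def __init__(self) -> None:
        self.cylinders = [{"id": 1}, {"id": 2}]


class Car:
    def __init__(self) -> None:
        self.engine = Engine()
        self.owner = {"name": None}


car = Car()
print(get_path(car, "engine.cylinders.[1].id"))  # 2
print(get_path(car, "owner.name", "n/a"))        # None (key exists, value is None)
print(get_path(car, "engine.turbo", "n/a"))      # n/a
```

Returning a default keeps the call sites short; if you prefer the behaviour of `find_key`, raising `KeyError(path)` where the sketch returns `default` would be the equivalent choice.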
Upvotes: 0
Reputation: 2732
I'm adding this answer for the sake of future researchers:
It seems jsonpath-rw is the library I was looking for from the beginning, as it does exactly what I originally asked for.
Upvotes: 1
Reputation: 313
@vBobCat I'm currently searching for a similar solution. Agreed that serializing and deserializing with json is not ideal. What did you end up going with?
I found http://objectpath.org/ to be close to the right solution for my use case, though it lacks a feature I need: making arbitrary updates to fields. Its syntax, though slightly different from JSONPath, expresses many of the things that JSONPath does.
Upvotes: 1
Reputation: 181
You can use the built-in json library to load JSON as a nested dictionary and traverse it using dictionary notation, e.g. root['level1_object']['level2_object']. JSON-compatible object types are of course loaded as the corresponding Python types.
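A minimal sketch of that traversal, using only the standard library. The JSON string and the `items` key are made up for illustration; the `level1_object` / `level2_object` names are the ones used above.

```python
import json

# Made-up document, only for illustration
raw = '{"level1_object": {"level2_object": {"items": [1, 2.5, "three", null, true]}}}'
root = json.loads(raw)

# Plain dictionary and list notation walks the nested structure
level2 = root["level1_object"]["level2_object"]
print(level2["items"][2])    # three
print(type(root).__name__)   # dict

# JSON values arrive as the corresponding Python types
print([type(x).__name__ for x in level2["items"]])
# ['int', 'float', 'str', 'NoneType', 'bool']
```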
For other types of data there are other libraries, which mostly behave in a similar fashion.
My new favourite is Box, which allows you to traverse nested dictionaries using a dot notation.
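Box is a third-party package and its API is not shown here. For a rough feel of dot-notation access without installing anything, a standard-library approximation is to decode JSON objects into `types.SimpleNamespace` instances. This is a swapped-in technique, not how Box works, and the `server` / `host` / `ports` names are invented for the example.

```python
import json
from types import SimpleNamespace

# Made-up document, only for illustration
raw = '{"server": {"host": "localhost", "ports": [8080, 8081]}}'

# object_hook is called for every decoded JSON object (dict);
# here each one becomes a namespace with attribute access
config = json.loads(raw, object_hook=lambda d: SimpleNamespace(**d))

print(config.server.host)      # localhost
print(config.server.ports[1])  # 8081
```

One limitation of this approximation: keys that are not valid Python identifiers (for example ones containing spaces or dashes) cannot be reached with dot notation and would need `getattr`.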
Upvotes: 2
Reputation: 427
You might want to take a look at the ast module: https://docs.python.org/2/library/ast.html
Upvotes: 1