I have a JSON document of the following sort. In another table, I have JSON paths that tell me which values to extract. When a path points to a value on a child element, I also need to get all the attribute values of the corresponding parent and store all of these values in a dataframe as a single row.
{
"Parent": {
"Name": "Bob",
"Age": "80",
"Children": [
{
"Name": "Michael",
"Gender":"M",
"Children": [
{
"Name": "Ezee",
"Gender": "M",
"Age": 20
},
{
"Name": "Ezee",
"Gender": "M",
"Age": 28,
"Children": [
{
"Name": "Dre",
"Age": 1
},
{
"Name": "George",
"Age": 2
}
]
}
],
"Age": 50,
"MiddleName": "Jay"
},
{
"Name": "Justin",
"Gender": "M",
"Children": [
{
"Name": "Emily",
"Age": 18,
"Gender": "F"
}
],
"Age": 45
}
]
}
}
Say I need to get the value for the JSON path Parent/Children/0/Children/0/Name. I then also need the attribute values (name, age, etc.) of the corresponding parent (Parent/Children/0/[Name or Age or etc.]) and want to store all of the above values as a single row.
Currently, I can get the parent values by passing the parent's path and the child values by passing the child's path, but only as two separate lookups.
def findValue(path, json_data):
    paths = path.split("/")
    data = json_data
    for i in range(0,len(paths)):
        if isinstance(data, list):
            paths[i]=int(paths[i])
            data = data[paths[i]]
        else:
            data = data.get(paths[i])
    return data
How could I achieve this?
To get the last parent, you need to traverse your path up to the last time you follow a Children list.
I.e. given your path 'Parent/Children/0/Children/0/Name', you want to return the data for the parent at 'Parent/Children/0'.
This is easy enough to do in Python: just slice the path string up to the last occurrence of the substring /Children:
path[:path.rfind('/Children')]
You can then use similar code to what you currently have to get the parent's data:
parent = json_data
path = path[:path.rfind('/Children')]
for attr in path.split('/'):
    parent = parent[int(attr) if isinstance(parent, list) else attr]
which, for this example, would give us parent as:
{
"Name": "Michael",
"Gender": "M",
"Children": [
{
"Name": "Ezee",
"Gender": "M",
"Age": 20
},
{
"Name": "Ezee",
"Gender": "M",
"Age": 28,
"Children": [
{
"Name": "Dre",
"Age": 1
},
{
"Name": "George",
"Age": 2
}
]
}
],
"Age": 50,
"MiddleName": "Jay"
}
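As a hedged sketch, the slicing and traversal above can be wrapped in a small helper. The name find_parent and the trimmed-down json_data are illustrative assumptions, not part of the original question. The helper also guards the case where the path contains no /Children segment: rfind returns -1 there, and the bare slice would silently drop the last character of the path.

```python
def find_parent(path, json_data):
    """Return the object that owns the last 'Children' list in `path`.

    Hypothetical helper; returns None when the path has no '/Children'.
    """
    cut = path.rfind('/Children')
    if cut == -1:
        # No Children segment, so there is no parent person to return.
        return None
    node = json_data
    for attr in path[:cut].split('/'):
        # Lists are indexed by integer, dicts by key.
        node = node[int(attr) if isinstance(node, list) else attr]
    return node


# Trimmed-down version of the question's data, for illustration only.
json_data = {
    "Parent": {
        "Name": "Bob",
        "Age": "80",
        "Children": [
            {
                "Name": "Michael",
                "Gender": "M",
                "Children": [{"Name": "Ezee", "Gender": "M", "Age": 20}],
                "Age": 50,
                "MiddleName": "Jay",
            }
        ],
    }
}

parent = find_parent("Parent/Children/0/Children/0/Name", json_data)
print(parent["Name"])  # Michael
```

Returning None for a path without a parent is one possible choice; raising an exception would work just as well if such paths should never appear in your table.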
And to complete your question fully, if you wanted this person's attributes (without their list of Children) as a single row, you would have to decide on a fixed order to store them in (such as alphabetical by key); you could then use the .items() method of a dict to extract them in that order:
[v for k,v in sorted(t for t in parent.items() if t[0] != 'Children')]
giving, for our example:
[50, 'M', 'Jay', 'Michael']
# Age, Gender, MiddleName, Name
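To tie this back to the question, here is a minimal sketch that puts the parent's attributes and the child's value into one row. The helper names get_value and build_row, the 'Parent.' column prefix, and the shortened json_data are all assumptions made for illustration. The row is a plain dict so the example runs without third-party packages.

```python
def get_value(path, data):
    """Follow a '/'-separated path through nested dicts and lists."""
    for attr in path.split('/'):
        data = data[int(attr) if isinstance(data, list) else attr]
    return data


def build_row(path, json_data):
    """Combine the child value at `path` with its parent's scalar attributes."""
    cut = path.rfind('/Children')
    if cut == -1:
        raise ValueError("path has no '/Children' segment: " + path)
    parent = get_value(path[:cut], json_data)
    # Prefix parent columns (an assumed naming scheme) so they cannot
    # collide with the child's column, and skip the nested Children list.
    row = {'Parent.' + k: v for k, v in sorted(parent.items()) if k != 'Children'}
    # The child's column is named after the last path segment.
    row[path.rsplit('/', 1)[-1]] = get_value(path, json_data)
    return row


# Shortened version of the question's data, for illustration only.
json_data = {
    "Parent": {
        "Name": "Bob",
        "Children": [
            {
                "Name": "Michael",
                "Gender": "M",
                "Children": [{"Name": "Ezee", "Gender": "M", "Age": 20}],
                "Age": 50,
                "MiddleName": "Jay",
            }
        ],
    }
}

row = build_row("Parent/Children/0/Children/0/Name", json_data)
print(row)
```

If pandas is installed, pandas.DataFrame([row]) should turn this dict into a one-row dataframe, and a list of such dicts into one row per path.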
Oh, and the whole of the first code block can be compressed to a one-liner if you want:
__import__('functools').reduce(lambda d,a:d[int(a) if isinstance(d,list) else a], path[:path.rfind('/Children')].split('/'), json_data)
If I understand correctly, given a path like Parent/Children/0/Children/0/Name you want the path to the same property on its parent. In this case that would be Parent/Children/0/Name.
Here is my attempt in a Python interpreter; I hope it helps:
>>> path = "Parent/Children/0/Children/0/Name"
>>> path_l = path.split('/')
>>> rev = path_l[::-1]
>>> rev
['Name', '0', 'Children', '0', 'Children', 'Parent']
>>> rev.index('Children')
2
>>> rev = rev[rev.index('Children')+1:]
>>> rev
['0', 'Children', 'Parent']
>>> final = rev[::-1] + [path_l[-1]]
>>> final
['Parent', 'Children', '0', 'Name']
>>> parent_path = '/'.join(final)
>>> parent_path
'Parent/Children/0/Name'
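Then, using your function, you can add the two values to a dataframe (DataFrame.append was removed in pandas 2.0, so the row is built first):
>>> import pandas
>>> row = {'Parent': findValue(parent_path, json_data), 'Children': findValue(path, json_data)}
>>> df = pandas.DataFrame([row])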
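The interpreter steps used above to derive the parent path can be condensed into one function. This is only a sketch: the name parent_property_path is hypothetical, and returning None for a path without a Children segment is an assumed convention.

```python
def parent_property_path(path):
    """Map a child's property path to the same property on its parent.

    Hypothetical helper; returns None if the path has no 'Children' segment.
    """
    parts = path.split('/')
    rev = parts[::-1]
    if 'Children' not in rev:
        return None
    # Position of the last 'Children' segment in the original order.
    last = len(parts) - 1 - rev.index('Children')
    # Keep everything before that segment, then re-attach the property name.
    return '/'.join(parts[:last] + [parts[-1]])


print(parent_property_path("Parent/Children/0/Children/0/Name"))
# Parent/Children/0/Name
```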