Reputation: 2346
I am receiving data in batches from an API in JSON format. I wish to store only the values, in a list.
The raw data looks like this and will always look like this, i.e., every {...} has the same structure as the first example:
data = content.get('data')
>>> [{'a':1, 'b':{'c':2, 'd':3}, 'e':4}, {...}, {...}, ...]
The nested dictionary is making this harder; I need this unpacked as well.
Here is what I have, which works but it feels so bad:
unpacked = []
data = content.get('data')
for d in data:
    item = []
    for k, v in d.items():
        if k == 'b':
            for val in v.values():
                item.append(val)
        else:
            item.append(v)
    unpacked.append(item)
Output:
>>> [[1,2,3,4], [...], [...], ...]
How can I improve this?
Upvotes: 3
Views: 14966
Reputation: 6363
For completeness, based on the excellent answer of Eric Duminil, here is a function that returns the maximum depth of a nested dict or list:
def depth(it, count=0):
    """Maximum depth of a nested dict or list (zero-based).

    # Arguments
        it: a nested dict or list.
        count: the current depth, used internally by the recursion.

    # Returns
        Numeric value.
    """
    if isinstance(it, dict):
        children = it.values()
    elif isinstance(it, list):
        children = it
    else:
        return count
    # Recurse into every nested container and keep the deepest branch,
    # rather than returning from the first nested container found.
    nested = [v for v in children if isinstance(v, (list, dict))]
    if not nested:
        return count
    return max(depth(v, count + 1) for v in nested)
In the Python tradition, it is zero-based.
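To illustrate what zero-based means here, below is a minimal sketch of the same idea written iteratively with an explicit stack. The name `max_depth` is hypothetical and not part of the answer above. It assumes only dicts and lists count as nesting levels.

```python
def max_depth(obj):
    """Zero-based maximum nesting depth of dicts/lists (iterative sketch)."""
    if not isinstance(obj, (dict, list)):
        return 0
    deepest = 0
    stack = [(obj, 0)]  # pairs of (container, level at which it sits)
    while stack:
        current, level = stack.pop()
        deepest = max(deepest, level)
        children = current.values() if isinstance(current, dict) else current
        for child in children:
            if isinstance(child, (dict, list)):
                stack.append((child, level + 1))
    return deepest


print(max_depth({'a': 1}))                                  # 0: no nesting
print(max_depth({'a': 1, 'b': {'c': 2, 'd': 3}, 'e': 4}))   # 1: one nested dict
print(max_depth([{'a': [1, {'b': 2}]}]))                    # 3
```

A flat container has depth 0, and each further level of dict or list nesting adds one.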
Upvotes: 1
Reputation: 6781
Doing it recursively:
def traverse(d):
    for key, val in d.items():
        if isinstance(val, dict):
            traverse(val)
        else:
            l.append(val)  # l is the global list reset in the loop below

out = []
for d in data:
    l = []
    traverse(d)
    out.append(l)
print(out)
#driver values :
IN : data = [{'a':1, 'b':{'c':2, 'd':3}, 'e':4}, {'f':5,'g':6}]
OUT : out = [[1, 2, 3, 4], [5, 6]]
EDIT: A better way to do this is to use yield, so as not to rely on a global variable as in the first method.
def traverse(d):
    for key, val in d.items():
        if isinstance(val, dict):
            yield from traverse(val)
        else:
            yield val

out = [list(traverse(d)) for d in data]
Upvotes: 0
Reputation: 6009
Other answers (especially @COLDSPEED's) have already covered the situation, but here is slightly different code based on the old adage "it's better to ask forgiveness than permission", which I tend to prefer to type checking:
def unpack(data):
    try:
        for value in data.values():
            yield from unpack(value)
    except AttributeError:
        yield data

data = [{'a':1, 'b':{'c':2, 'd':3}, 'e':4}]
unpacked = [list(unpack(item)) for item in data]
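As a quick check of what the duck-typed version treats as "nested", here is a small sketch. The `sample` data is made up for illustration, and the output order assumes Python 3.7+, where dicts preserve insertion order. Anything with a `.values()` method is descended into, while lists and strings are yielded as they are.

```python
from collections import OrderedDict


def unpack(data):
    # Ask forgiveness, not permission: try to treat data as a mapping.
    try:
        for value in data.values():
            yield from unpack(value)
    except AttributeError:
        # No .values() method, so this is a leaf value.
        yield data


# Hypothetical sample: a dict subclass, an inner list and a string.
sample = {'a': 1, 'b': OrderedDict(c=2, d=3), 'e': [4, 5], 'f': 'text'}
flat = list(unpack(sample))
print(flat)  # [1, 2, 3, [4, 5], 'text']
```

Note that the inner list comes back unflattened, so this approach fits data made of nested mappings only.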
Upvotes: 0
Reputation: 402573
Assuming your dictionaries do not contain inner lists, you could define a simple routine to unpack a nested dictionary, and iterate through each item in data using a loop.
def unpack(data):
    for k, v in data.items():
        if isinstance(v, dict):
            yield from unpack(v)
        else:
            yield v
Note that this function is as simple as it is thanks to the magic of yield from. Now, let's call it with some data.
data = [{'a':1, 'b':{'c':2, 'd':3}, 'e':4}, {'f':5,'g':6}] # Data "borrowed" from Kaushik NP
result = [list(unpack(x)) for x in data]
print(result)
[[2, 3, 1, 4], [5, 6]]
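Note the lack of order in the result above: before Python 3.7, dictionaries had no guaranteed order. On Python 3.7 and later, dicts preserve insertion order, so the first item comes out as [1, 2, 3, 4].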
Upvotes: 2
Reputation: 54233
You could use a recursive function and some type tests:
data = [{'a':1, 'b':{'c':2, 'd':3}, 'e':4}, {'f':5,'g':6}]

def extract_nested_values(it):
    if isinstance(it, list):
        for sub_it in it:
            yield from extract_nested_values(sub_it)
    elif isinstance(it, dict):
        for value in it.values():
            yield from extract_nested_values(value)
    else:
        yield it

print(list(extract_nested_values(data)))
# [1, 2, 3, 4, 5, 6]
Note that it outputs a flat generator, not a list of lists.
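If you want the list of lists from the question instead, one option is to call the generator once per top-level item. This is a minimal sketch that repeats the function above so it runs on its own. The name `per_item` is just for illustration, and the order shown assumes Python 3.7+.

```python
def extract_nested_values(it):
    if isinstance(it, list):
        for sub_it in it:
            yield from extract_nested_values(sub_it)
    elif isinstance(it, dict):
        for value in it.values():
            yield from extract_nested_values(value)
    else:
        yield it


data = [{'a': 1, 'b': {'c': 2, 'd': 3}, 'e': 4}, {'f': 5, 'g': 6}]

# One flat list per top-level dictionary.
per_item = [list(extract_nested_values(item)) for item in data]
print(per_item)  # [[1, 2, 3, 4], [5, 6]]
```

Because the function also descends into lists, any inner lists inside an item are flattened into that item's row.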
Upvotes: 6