Reputation: 47
I have a data structure like this:
[ {'uid': 'test_subject145', 'class':'?', 'data':[ {'chunk':1, 'writing':[ ['this is exciting'],[ 'you are good' ]... ]} ] },
{'uid': 'test_subject166', 'class':'?', 'data':[ {'chunk':2, 'writing':[ ['he died'],[ 'go ahead' ]... ]} ] }, ...]
It is a list containing many dictionaries, each of which has 3 pairs: 'uid': 'test_subject145', 'class':'?', 'data':[].
In the last pair, 'data', the value is a list that contains another dictionary with 2 pairs: 'chunk':1, 'writing':[]. In the pair 'writing', the value is a list that itself contains many lists.
What I want to extract is the content of all those sentences, like 'this is exciting' and 'you are good', and put them into a simple list. Its final form should be list_final = ['this is exciting', 'you are good', 'he died',... ]
Upvotes: 3
Views: 1130
Reputation: 5821
tl;dr
[s for dic in data
 for data_dict in dic['data']
 for writing_sub_list in data_dict['writing']
 for s in writing_sub_list]
Just go slow and do one layer at a time. Then refactor your code to make it smaller.
data = [{'class': '?',
         'data': [{'chunk': 1,
                   'writing': [['this is exciting'], ['you are good']]}],
         'uid': 'test_subject145'},
        {'class': '?',
         'data': [{'chunk': 2,
                   'writing': [['he died'], ['go ahead']]}],
         'uid': 'test_subject166'}]
for d in data:
    print(d)
# {'class': '?', 'uid': 'test_subject145', 'data': [{'writing': [['this is exciting'], ['you are good']], 'chunk': 1}]}
# {'class': '?', 'uid': 'test_subject166', 'data': [{'writing': [['he died'], ['go ahead']], 'chunk': 2}]}
for d in data:
    data_list = d['data']
    print(data_list)
# [{'writing': [['this is exciting'], ['you are good']], 'chunk': 1}]
# [{'writing': [['he died'], ['go ahead']], 'chunk': 2}]
for d in data:
    data_list = d['data']
    for d2 in data_list:
        print(d2)
# {'writing': [['this is exciting'], ['you are good']], 'chunk': 1}
# {'writing': [['he died'], ['go ahead']], 'chunk': 2}
for d in data:
    data_list = d['data']
    for d2 in data_list:
        writing_list = d2['writing']
        print(writing_list)
# [['this is exciting'], ['you are good']]
# [['he died'], ['go ahead']]
for d in data:
    data_list = d['data']
    for d2 in data_list:
        writing_list = d2['writing']
        for writing_sub_list in writing_list:
            print(writing_sub_list)
# ['this is exciting']
# ['you are good']
# ['he died']
# ['go ahead']
for d in data:
    data_list = d['data']
    for d2 in data_list:
        writing_list = d2['writing']
        for writing_sub_list in writing_list:
            # named s rather than str so the built-in str type is not shadowed
            for s in writing_sub_list:
                print(s)
# this is exciting
# you are good
# he died
# go ahead
Then, to convert this into something smaller (but harder to read), rewrite the code above as a single list comprehension. It should be easy to see how to go from one to the other:
strings = [s for d in data for d2 in d['data'] for wsl in d2['writing'] for s in wsl]
# ['this is exciting', 'you are good', 'he died', 'go ahead']
Then, make it pretty with better names like Willem's answer:
[s for dic in data
 for data_dict in dic['data']
 for writing_sub_list in data_dict['writing']
 for s in writing_sub_list]
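To check that the nested loops and the comprehension really do the same thing, here is a small self-contained sketch. The function names flatten_with_loops and flatten_with_comprehension are made up for illustration, and the sample data is the same as above.

```python
data = [
    {'uid': 'test_subject145', 'class': '?',
     'data': [{'chunk': 1, 'writing': [['this is exciting'], ['you are good']]}]},
    {'uid': 'test_subject166', 'class': '?',
     'data': [{'chunk': 2, 'writing': [['he died'], ['go ahead']]}]},
]


def flatten_with_loops(records):
    """Hypothetical helper: one explicit loop per nesting level."""
    result = []
    for record in records:
        for data_dict in record['data']:
            for writing_sub_list in data_dict['writing']:
                for sentence in writing_sub_list:
                    result.append(sentence)
    return result


def flatten_with_comprehension(records):
    """Hypothetical helper: the same loops, in the same order, as one expression."""
    return [sentence
            for record in records
            for data_dict in record['data']
            for writing_sub_list in data_dict['writing']
            for sentence in writing_sub_list]


# Both versions visit the sentences in the same order.
print(flatten_with_loops(data) == flatten_with_comprehension(data))  # True
```

The for clauses in the comprehension appear in exactly the order of the nested loops, outermost first, which is why the translation is mechanical.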
Upvotes: 2
Reputation: 477533
Given your original list is named input, simply use a list comprehension:
[elem for dic in input
 for dat in dic.get('data',())
 for writing in dat.get('writing',())
 for elem in writing]
You can use .get(..,()) so that it still works when a key is missing: in that case we get the empty tuple () back, so there are no iterations.
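As a rough illustration of that point, the sketch below runs the comprehension on records where some keys are missing. The records list is invented for this example and is not part of the question's data.

```python
# Invented sample: the second record has no 'data' key,
# and the third has a 'data' entry without a 'writing' key.
records = [
    {'uid': 'a', 'data': [{'chunk': 1, 'writing': [['this is exciting']]}]},
    {'uid': 'b'},
    {'uid': 'c', 'data': [{'chunk': 3}]},
]

# .get(key, ()) falls back to an empty tuple, so the inner loops simply
# run zero times for the incomplete records.
sentences = [elem
             for dic in records
             for dat in dic.get('data', ())
             for writing in dat.get('writing', ())
             for elem in writing]
print(sentences)  # ['this is exciting']

# Plain indexing, by contrast, raises KeyError on the first missing key.
try:
    [dic['data'] for dic in records]
    missing_key = None
except KeyError as exc:
    missing_key = exc.args[0]
print(missing_key)  # data
```

Note that the default only applies when the key is absent. If a key is present but holds None, .get returns None and the loop raises a TypeError.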
Based on your sample input, we get:
>>> input = [ {'uid': 'test_subject145', 'class':'?', 'data':[ {'chunk':1, 'writing':[ ['this is exciting'],[ 'you are good' ]]} ] },
... {'uid': 'test_subject166', 'class':'?', 'data':[ {'chunk':2, 'writing':[ ['he died'],[ 'go ahead' ] ]} ] }]
>>>
>>> [elem for dic in input
... for dat in dic.get('data',())
... for writing in dat.get('writing',())
... for elem in writing]
['this is exciting', 'you are good', 'he died', 'go ahead']
Upvotes: 3
Reputation: 407
I believe the code below will work:
lista = [ {'uid': 'test_subject145', 'class':'?', 'data':[ {'chunk':1, 'writing':[ ['this is exciting'],[ 'you are good' ]... ]} ] },
{'uid': 'test_subject166', 'class':'?', 'data':[ {'chunk':2, 'writing':[ ['he died'],[ 'go ahead' ]... ]} ] }, ...]
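list_of_final_products = []
for itema in lista:
    try:
        for data_item in itema['data']:
            for writa in data_item['writing']:
                for writa_itema in writa:
                    # append the inner string, not the sub-list, so the result is flat
                    list_of_final_products.append(writa_itema)
    except (KeyError, TypeError):
        # skip items that lack the expected keys or structure
        pass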
I believe this question, as referenced above, is helpful for understanding: "python getting a list of value from list of dict" (thank you to McGrady).
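A minimal sketch of this loop approach wrapped in a function, assuming the elided '...' entries in the sample are simply left out. It appends each inner string, since appending the sub-list itself would give a list of lists. The name flatten_writings and the extra 'broken' record are made up for illustration.

```python
def flatten_writings(records):
    """Hypothetical helper: collect every sentence, skipping malformed records."""
    sentences = []
    for record in records:
        try:
            for data_item in record['data']:
                for sub_list in data_item['writing']:
                    for sentence in sub_list:
                        sentences.append(sentence)
        except (KeyError, TypeError):
            # An expected key is missing or a value is not iterable:
            # give up on the rest of this record and move on.
            continue
    return sentences


sample = [
    {'uid': 'test_subject145', 'class': '?',
     'data': [{'chunk': 1, 'writing': [['this is exciting'], ['you are good']]}]},
    {'uid': 'broken', 'class': '?'},  # invented record without a 'data' key
    {'uid': 'test_subject166', 'class': '?',
     'data': [{'chunk': 2, 'writing': [['he died'], ['go ahead']]}]},
]

print(flatten_writings(sample))
# ['this is exciting', 'you are good', 'he died', 'go ahead']
```

Catching only KeyError and TypeError, rather than using a bare except, keeps unrelated problems such as a KeyboardInterrupt from being silently swallowed.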
Upvotes: 1