Reputation: 1372
I have a list of dictionaries, l1, that looks like this:
[{'A1': 'string',
'B1': {'ba': 'string',
'bb': 'string',
'bc': 'string',
'bd': 'string',
'be': 'string'},
'C1': {'ca': 'string',
'cb': [[[123,123],[123,123]]]},
'D1': 'string'},
...]
Some of the dictionaries (elements of l1) might have some of the keys missing; for example, the second element of l1 might not have the 'bc': 'string' key/value pair.
I need to extract the following top-level and nested key/value pairs into a dataframe, which will look like this:
bc      bd      cb                       D1
string  string  [[[123,123],[123,123]]]  string
N/A     string  [[[123,123],[123,123]]]  string
...
string  N/A     [[[123,123],[123,123]]]  string
The code I have is below:
temp_df = pd.DataFrame(columns = ['bc','bd','cb','D1'])
for i in l1:
    temp_df = temp_df.append({'bc': i.get(['B1']['bc'],'N/A'),
                              'bd': i.get(['B1']['bd'],'N/A'),
                              'cb': i.get(['C1']['cb'],'N/A'),
                              'D1': i.get(['D1'],'N/A')},
                             ignore_index=True)
The error I am getting is below:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-113-543c6addad42> in <module>
1 for i in l1:
----> 2 temp_df = temp_df.append({'bc': i.get(['B1']['bc'],'N/A'),
3 'bd': i.get(['B1']['bd'],'N/A'),
4 'C1': i.get(['C1']['cb'],'N/A'),
5 'D1': i.get(['D1'],'N/A')},
TypeError: list indices must be integers or slices, not str
What am I doing wrong?
Upvotes: 1
Views: 387
Reputation: 9308
You can refer to @Psidom's answer for your error. However, to achieve what you are trying to do, you can alternatively use pd.json_normalize, which flattens nested dictionaries into dotted column names such as 'B1.bc':
pd.json_normalize(l1).rename(columns={
'B1.bc': 'bc',
'B1.bd': 'bd',
'C1.cb': 'cb'
}).fillna('N/A')[['bc', 'bd', 'cb', 'D1']]
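For reference, here is a minimal self-contained sketch of the same approach. The two sample records are made up for illustration (they are not the asker's real data); the second one omits 'bc' to show how a missing nested key ends up as 'N/A'.

```python
import pandas as pd

# Hypothetical sample data in the shape described in the question.
# The second record is missing the nested 'bc' key.
l1 = [
    {'A1': 'a0', 'B1': {'ba': 'x0', 'bc': 'c0', 'bd': 'd0'},
     'C1': {'ca': 'y0', 'cb': [[[123, 123], [123, 123]]]}, 'D1': 'e0'},
    {'A1': 'a1', 'B1': {'ba': 'x1', 'bd': 'd1'},
     'C1': {'ca': 'y1', 'cb': [[[123, 123], [123, 123]]]}, 'D1': 'e1'},
]

# Nested dicts become dotted columns ('B1.bc', 'C1.cb', ...);
# list values such as 'cb' are kept as-is in a single cell.
flat = pd.json_normalize(l1)

result = (
    flat.rename(columns={'B1.bc': 'bc', 'B1.bd': 'bd', 'C1.cb': 'cb'})
        .fillna('N/A')[['bc', 'bd', 'cb', 'D1']]
)
print(result)
```

One caveat: if none of the records contains a given key (say, no record has 'bc' at all), the corresponding column never appears and the final column selection raises a KeyError. Calling .reindex(columns=['bc', 'bd', 'cb', 'D1']) before .fillna('N/A') is one way to guard against that.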
Upvotes: 3
Reputation: 214957
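Instead of i.get(['B1']['bc'],'N/A'), which raises the TypeError because ['B1'] is a list literal and a list cannot be indexed with the string 'bc', use i.get('B1', {}).get('bc', 'N/A') to get nested keys. Also, don't append to a pandas dataframe inside a loop: it's slow, since every append builds a new dataframe (and DataFrame.append was removed in pandas 2.0). Append to a list first and then convert the list to a dataframe.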
import pandas as pd

lst = []
for i in l1:
    # The {} default lets the second .get run even when the outer key is missing.
    lst.append({
        'bc': i.get('B1', {}).get('bc', 'N/A'),
        'bd': i.get('B1', {}).get('bd', 'N/A'),
        'cb': i.get('C1', {}).get('cb', 'N/A'),
        'D1': i.get('D1', 'N/A')
    })
pd.DataFrame(lst)
       bc      bd                          cb      D1
0  string  string  [[[123, 123], [123, 123]]]  string
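To make the cause of the error concrete, and to avoid repeating the chained .get calls, here is a small pure-Python sketch. The helper name deep_get, the columns mapping, and the sample records are all hypothetical and introduced only for illustration; this is one possible way to write it, not something pandas provides.

```python
def deep_get(record, path, default='N/A'):
    """Hypothetical helper: follow a sequence of keys through nested dicts."""
    current = record
    for key in path:
        # Give up as soon as a level is not a dict or lacks the key.
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current

# Why the original code fails: the expression inside i.get(...) indexes
# a one-element list with a string, before .get is ever called.
keys = ['B1']
try:
    keys['bc']
except TypeError as exc:
    error_message = str(exc)

# Made-up records: the second lacks 'bc', the third lacks 'C1' entirely.
l1 = [
    {'B1': {'bc': 'c0', 'bd': 'd0'}, 'C1': {'cb': [[[123, 123]]]}, 'D1': 'e0'},
    {'B1': {'bd': 'd1'}, 'C1': {'cb': [[[123, 123]]]}, 'D1': 'e1'},
    {'B1': {'bc': 'c2', 'bd': 'd2'}, 'D1': 'e2'},
]

# Output column name -> path of keys to follow in each record.
columns = {'bc': ('B1', 'bc'), 'bd': ('B1', 'bd'),
           'cb': ('C1', 'cb'), 'D1': ('D1',)}

rows = [{name: deep_get(rec, path) for name, path in columns.items()}
        for rec in l1]
print(rows)
```

Passing rows to pd.DataFrame(rows) should then give the same kind of table as above, with 'N/A' wherever a key was missing at any level.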
Upvotes: 5