Extracting dictionary nested elements into a Pandas dataframe

Question

I have a list of dictionaries, l1, that looks like this:

[{'A1': 'string',
  'B1': {'ba': 'string',
        'bb': 'string',
        'bc': 'string',
        'bd': 'string',
        'be': 'string'},
  'C1': {'ca': 'string',
        'cb': [[[123,123],[123,123]]]},
  'D1': 'string'},
  ...]

Some of the dictionaries (elements of l1) might be missing some of the keys; for example, the second element of l1 might not have the 'bc': 'string' key/value pair.

I need to extract the following top-level and nested key/value pairs into a dataframe that looks like this:

bc      bd      cb                        D1
string  string  [[[123,123],[123,123]]]   string
N/A     string  [[[123,123],[123,123]]]   string
...
string  N/A     [[[123,123],[123,123]]]   string

The code I have is below:

temp_df = pd.DataFrame(columns=['bc', 'bd', 'cb', 'D1'])

for i in l1:
    temp_df = temp_df.append({'bc': i.get(['B1']['bc'],'N/A'),
                              'bd': i.get(['B1']['bd'],'N/A'),
                              'cb': i.get(['C1']['cb'],'N/A'),
                              'D1': i.get(['D1'],'N/A')}, 
                               ignore_index=True)

The error I am getting is below:

 ---------------------------------------------------------------------------
 TypeError                                 Traceback (most recent call last)
  in 
       1 for i in l1:
 ----> 2     temp_df = temp_df.append({'bc': i.get(['B1']['bc'],'N/A'),
       3                                 'bd': i.get(['B1']['bd'],'N/A'),
       4                                 'C1': i.get(['C1']['cb'],'N/A'),
       5                                 'D1': i.get(['D1'],'N/A')}, 

  TypeError: list indices must be integers or slices, not str

What am I doing wrong?

akuiper · Accepted Answer

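i.get(['B1']['bc'], 'N/A') fails because ['B1']['bc'] is evaluated first: it indexes the one-element list ['B1'] with a string, which raises the TypeError. Use i.get('B1', {}).get('bc', 'N/A') to read nested keys instead. Also, don't append to a pandas dataframe row by row; it's slow, and DataFrame.append was removed in pandas 2.0. Append to a list first and then convert the list to a dataframe.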
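The failure can be reproduced in isolation, because the expression ['B1']['bc'] is evaluated before .get() is ever called. A minimal sketch, assuming a made-up record named row (not from the question) and a list variable named keys standing in for the literal ['B1']:

```python
# Hypothetical sample record; 'bc' and the whole 'C1' dictionary are missing.
row = {'A1': 'x', 'B1': {'bd': 'y'}, 'D1': 'z'}

# ['B1'] is a list with one element; indexing a list with a string
# raises TypeError regardless of what the dictionary contains.
keys = ['B1']
try:
    keys['bc']
    message = None
except TypeError as exc:
    message = str(exc)

# Chained .get() calls: the {} default keeps the second .get() safe
# even when the outer key is absent.
bc = row.get('B1', {}).get('bc', 'N/A')
bd = row.get('B1', {}).get('bd', 'N/A')
cb = row.get('C1', {}).get('cb', 'N/A')

print(message)
print(bc, bd, cb)
```

One caveat with the chained form: if an outer key is present but holds None rather than a dictionary, the second .get() would still fail, so it relies on nested values being dictionaries whenever they exist.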

lst = []
for i in l1:
    lst.append({
        'bc': i.get('B1', {}).get('bc', 'N/A'),
        'bd': i.get('B1', {}).get('bd', 'N/A'),
        'cb': i.get('C1', {}).get('cb', 'N/A'),
        'D1': i.get('D1', 'N/A')
    })

pd.DataFrame(lst)

       bc      bd                          cb      D1
0  string  string  [[[123, 123], [123, 123]]]  string
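For records with missing keys, the same idea can be written once as a small helper instead of repeating the chained .get() calls. This is only a sketch: the helper name dig, the paths mapping, and the sample data are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data modelled on the question: the second record
# lacks 'bc', the third lacks the whole 'C1' dictionary.
l1 = [
    {'A1': 'a', 'B1': {'bc': 'x', 'bd': 'y'},
     'C1': {'ca': 'c', 'cb': [[[123, 123], [123, 123]]]}, 'D1': 'd'},
    {'A1': 'a', 'B1': {'bd': 'y'},
     'C1': {'ca': 'c', 'cb': [[[123, 123], [123, 123]]]}, 'D1': 'd'},
    {'A1': 'a', 'B1': {'bc': 'x', 'bd': 'y'}, 'D1': 'd'},
]


def dig(record, path, default='N/A'):
    """Follow a sequence of keys through nested dicts.

    Returns default as soon as a key is missing or a non-dict is reached.
    """
    current = record
    for key in path:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Output column name -> path of keys inside each record.
paths = {
    'bc': ('B1', 'bc'),
    'bd': ('B1', 'bd'),
    'cb': ('C1', 'cb'),
    'D1': ('D1',),
}

# Build all rows as plain dicts first, then create the dataframe once.
rows = [{col: dig(rec, path) for col, path in paths.items()} for rec in l1]
df = pd.DataFrame(rows, columns=list(paths))
print(df)
```

Keeping the column-to-path mapping in one dictionary means adding another nested field is a one-line change, and the dataframe is still built in a single call rather than grown row by row.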
