DKM

Reputation: 1801

Unpack list of elements into pandas DataFrame

I am trying to create a pandas DataFrame from the response of an API. I can access the top-level elements, but not the elements nested inside lists.

This is part of the response I got:

[{'account_currency': 'USD',
  'account_name': 'S series',
  'actions': [{'action_type': 'video_view', 'value': '1500'},
              {'action_type': 'post_reaction', 'value': '39'}],
  'video_p100_watched_actions': [{'action_type': 'video_view',
                                  'value': '200'}]},
 {'account_currency': 'USD',
  'account_name': 'S New series',
  'actions': [{'action_type': 'video_view', 'value': '1400'},
              {'action_type': 'post_reaction', 'value': '17'}],
  'video_p100_watched_actions': [{'action_type': 'video_view',
                                  'value': '1200'}]}]

My approach:

final_results = []
for obj in results:
    video_100 = obj['video_p100_watched_actions']
    actions = obj['actions']
    final_results.append([obj['account_name'], obj['account_currency']])

I am trying to append the post_reaction value to final_results as well, but I am not able to access that element.

Expected Output:

Currency    Account name    Post Reaction   video view
USD         Series          39              200
USD         New Series      17              1200

Upvotes: 1

Views: 74

Answers (2)

jezrael

Reputation: 862581

Use json_normalize to flatten each nested list, join the results with concat, convert value to integer, and pivot with DataFrame.pivot_table:

import pandas as pd

# One row per entry in 'actions'; drop video_view so it is not double-counted
df1 = pd.json_normalize(results, 'actions', ['account_currency','account_name'])
df1 = df1[df1['action_type'].ne('video_view')]

# One row per entry in 'video_p100_watched_actions'
df2 = pd.json_normalize(results, 'video_p100_watched_actions',
                        ['account_currency','account_name'])

df = (pd.concat([df1, df2], ignore_index=True)
        .assign(value=lambda x: x['value'].astype(int))
        .pivot_table(index=['account_currency','account_name'],
                     columns='action_type',
                     values='value',
                     aggfunc='sum')
        .reset_index())
print(df)
action_type account_currency  account_name  post_reaction  video_view
0                        USD  S New series             17        1200
1                        USD      S series             39         200
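
To make the json_normalize step easier to follow, here is a minimal sketch of the intermediate frame it produces for a single record. The record is copied from the question's sample; the variable name flat is only illustrative. The second and third positional arguments used above correspond to the record_path and meta keywords.

```python
import pandas as pd

# One record copied from the question's sample response.
results = [
    {
        "account_currency": "USD",
        "account_name": "S series",
        "actions": [
            {"action_type": "video_view", "value": "1500"},
            {"action_type": "post_reaction", "value": "39"},
        ],
        "video_p100_watched_actions": [
            {"action_type": "video_view", "value": "200"},
        ],
    },
]

# record_path="actions": one output row per dict in each 'actions' list.
# meta=[...]: copy these parent-level keys onto every generated row.
flat = pd.json_normalize(
    results,
    record_path="actions",
    meta=["account_currency", "account_name"],
)
print(flat)
```

Each nested dict becomes its own row, and the values are still strings at this point, which is why the answer converts value with astype(int) before pivoting.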
    

Your loop-based solution can be changed like this:

final_results = []
for obj in results:
    # e.g. {'video_view': '200'} from the nested list
    video_100 = {x['action_type']: x['value']
                 for x in obj['video_p100_watched_actions']}
    # keep only post_reaction from 'actions'
    actions = {x['action_type']: x['value']
               for x in obj['actions'] if x['action_type'] == 'post_reaction'}
    # top-level fields, minus the two nested lists
    d = {**{k: v for k, v in obj.items()
            if k not in ['actions', 'video_p100_watched_actions']},
         **video_100, **actions}
    final_results.append(d)

df = pd.DataFrame(final_results)
print(df)
  account_currency  account_name video_view post_reaction
0              USD      S series        200            39
1              USD  S New series       1200            17
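
Note that in the loop version the values stay strings. As a rough sketch only, assuming every value is a numeric string and that a record might lack one of the nested lists, a small helper (flatten is a hypothetical name, not part of pandas) could convert the values while building each row:

```python
# Sample data copied from the question.
results = [
    {"account_currency": "USD", "account_name": "S series",
     "actions": [{"action_type": "video_view", "value": "1500"},
                 {"action_type": "post_reaction", "value": "39"}],
     "video_p100_watched_actions": [{"action_type": "video_view", "value": "200"}]},
    {"account_currency": "USD", "account_name": "S New series",
     "actions": [{"action_type": "video_view", "value": "1400"},
                 {"action_type": "post_reaction", "value": "17"}],
     "video_p100_watched_actions": [{"action_type": "video_view", "value": "1200"}]},
]


def flatten(obj):
    """Build one flat row (a dict) from a record shaped like the sample."""
    row = {
        "account_currency": obj["account_currency"],
        "account_name": obj["account_name"],
    }
    # post_reaction is taken from 'actions'; other action types are skipped.
    for action in obj.get("actions", []):
        if action["action_type"] == "post_reaction":
            row["post_reaction"] = int(action["value"])
    # video_view is taken from 'video_p100_watched_actions'.
    for action in obj.get("video_p100_watched_actions", []):
        row[action["action_type"]] = int(action["value"])
    return row


rows = [flatten(obj) for obj in results]
print(rows)
```

The resulting list of dicts can be passed to pd.DataFrame(rows) in the same way as final_results.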

Upvotes: 1

wvdgoot

Reputation: 351

This code collects the post_reaction value of each record and prints the resulting list:

final_results = []
for obj in results:
    post_reaction = obj['actions'][1]['value']
    final_results.append(post_reaction)

print(final_results)

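In the line post_reaction = obj['actions'][1]['value'], ['actions'] accesses the list stored under the key actions, [1] accesses the second dictionary in that list, and ['value'] accesses the value stored under the key value. This relies on post_reaction always being the second entry in actions, as it is in the sample response.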
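
If the position of post_reaction inside actions is not guaranteed, a lookup by action_type is safer than a fixed index. The following is a sketch under that assumption; action_value is a hypothetical helper name, and the data is trimmed from the question's sample.

```python
def action_value(actions, action_type, default=None):
    """Return the 'value' of the first entry whose action_type matches, else default."""
    return next(
        (a["value"] for a in actions if a["action_type"] == action_type),
        default,
    )


# Trimmed copy of the question's sample (only the 'actions' lists are needed here).
results = [
    {"account_name": "S series",
     "actions": [{"action_type": "video_view", "value": "1500"},
                 {"action_type": "post_reaction", "value": "39"}]},
    {"account_name": "S New series",
     "actions": [{"action_type": "video_view", "value": "1400"},
                 {"action_type": "post_reaction", "value": "17"}]},
]

final_results = [action_value(obj["actions"], "post_reaction") for obj in results]
print(final_results)  # ['39', '17']
```

When no entry matches, the helper returns the default (None here) instead of raising an IndexError.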

Upvotes: 0
