Eran Moshe

Reputation: 3208

Pandas - Flatten a column which is a list of dictionaries

Assuming I have the following DataFrame:

import pandas as pd
df = pd.DataFrame({'events': [ [{'event_text': 'hello1'}, {'event_text': 'hello2'}],
                                [{'event_text': 'whats up?'}],
                                [{'event_text': 'all good'}, {'event_text': 'bye'}] ]})

print(df)
                                              events
0  [{'event_text': 'hello1'}, {'event_text': 'hel...
1                      [{'event_text': 'whats up?'}]
2  [{'event_text': 'all good'}, {'event_text': 'b...

I'm trying to extract all the texts into a single column, like so:

0     hello1
1     hello2
2  whats up?
3   all good
4        bye

I think the solution involves json_normalize. I've tried the following:

from pandas.io.json import json_normalize
df['events'].apply(json_normalize)

But it yielded a Series holding one small DataFrame per row, rather than a single flat column:

0      event_text
0     hello1
1     hello2
1                   event_text
0  whats up?
2      event_text
0   all good
1        bye

Is there a Pythonic way to handle this?

Upvotes: 3

Views: 1399

Answers (1)

jezrael

Reputation: 863166

Flatten the lists with a nested list comprehension, use dict.get to select event_text from each dictionary, and pass the result to pd.Series:

s = pd.Series([y.get('event_text') for x in df['events'] for y in x])
print(s)
0       hello1
1       hello2
2    whats up?
3     all good
4          bye
dtype: object
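
For comparison, here is a sketch of two alternatives that stay inside pandas. They are not part of the original answer, and they assume a reasonably recent pandas (Series.explode needs 0.25 or later, and the top-level pd.json_normalize needs 1.0 or later). The names s_explode and s_normalize are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'events': [[{'event_text': 'hello1'}, {'event_text': 'hello2'}],
                              [{'event_text': 'whats up?'}],
                              [{'event_text': 'all good'}, {'event_text': 'bye'}]]})

# Option 1: explode() puts each dict on its own row,
# then .str.get() pulls one key out of every dict.
s_explode = (df['events']
             .explode()
             .str.get('event_text')
             .reset_index(drop=True))

# Option 2: flatten to a plain list of dicts first, then let
# json_normalize build a DataFrame and select the column.
flat = [d for row in df['events'] for d in row]
s_normalize = pd.json_normalize(flat)['event_text']

print(s_explode)
```

Both should give the same five values as the list comprehension above. The explode route keeps the original row labels until reset_index(drop=True) discards them, which can be handy if you need to trace each text back to its source row.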

Upvotes: 8
