Reputation: 537
I have a data frame that has as index a timestamp and a column that has list of dictionaries:
index var_A
2019-08-21 09:05:49 [{"Date1": "Aug 21, 2017 9:09:51 AM","Date2": "Aug 21, 2017 9:09:54 AM","Id": "d5e665e5","num_ins": 108,"num_del": 0, "time": 356} , {"Date1": "Aug 21, 2017 9:09:57 AM","Date2": "Aug 21, 2017 9:09:59 AM","Id": "d5e665e5","num_ins": 218,"num_del": 5, "time": 166}]
2019-08-21 09:05:59 [{"Date1": "Aug 21, 2017 9:10:01 AM","Date2": "Aug 21, 2017 9:11:54 AM","Id": "d5e665e5","num_ins": 348,"num_del": 72, "time": 3356} , {"Date1": "Aug 21, 2017 9:19:57 AM","Date2": "Aug 21, 2017 9:19:59 AM","Id": "d5e665e5","num_ins": 69,"num_del": 5, "time": 125}, {"Date1": "Aug 21, 2017 9:20:01 AM","Date2": "Aug 21, 2017 9:21:54 AM","Id": "f9e775f9","num_ins": 470,"num_del": 0, "time": 290} ]
2019-08-21 09:06:04 []
What I wish to achieve is a dataframe like:
index Date1 Date2 Id num_ins num_del time
2019-08-21 09:05:49 Aug 21, 2017 9:09:51AM Aug 21, 2017 9:09:54AM d5e665e5 0 108 356
2019-08-21 09:05:49 Aug 21, 2017 9:09:57AM Aug 21, 2017 9:09:59AM d5e665e5 218 5 166
2019-08-21 09:05:59 Aug 21, 2017 9:10:01AM Aug 21, 2017 9:11:54AM d5e665e5 348 72 3356
2019-08-21 09:05:59 Aug 21, 2017 9:19:57AM Aug 21, 2017 9:19:59AM d5e665e5 69 5 125
2019-08-21 09:05:59 Aug 21, 2017 9:20:01AM Aug 21, 2017 9:21:54AM f9e775f9 470 0 290
2019-08-21 09:06:04 NAN NAN NAN NAN NAN NAN
Upvotes: 2
Views: 1760
Reputation: 863631
Loop by each value with enumerate
, because duplicated inex values and create DataFrame
s, then create DataFrame for empty lists and last concat
together:
import ast
out = {}
for i, (k, v) in enumerate(df['var_A'].items()):
df = pd.DataFrame(v)
if df.empty:
out[(i, k)] = pd.DataFrame(index=[0], columns=['Id'])
else:
out[(i, k)] = df
df = pd.concat(out, sort=True).reset_index(level=[0,2], drop=True)
print (df)
Date1 Date2 \
2019-08-21 09:05:49 Aug 21, 2017 9:09:51 AM Aug 21, 2017 9:09:54 AM
2019-08-21 09:05:49 Aug 21, 2017 9:09:57 AM Aug 21, 2017 9:09:59 AM
2019-08-21 09:05:59 Aug 21, 2017 9:10:01 AM Aug 21, 2017 9:11:54 AM
2019-08-21 09:05:59 Aug 21, 2017 9:19:57 AM Aug 21, 2017 9:19:59 AM
2019-08-21 09:05:59 Aug 21, 2017 9:20:01 AM Aug 21, 2017 9:21:54 AM
2019-08-21 09:05:59 NaN NaN
Id num_del num_ins time
2019-08-21 09:05:49 d5e665e5 0.0 108.0 356.0
2019-08-21 09:05:49 d5e665e5 5.0 218.0 166.0
2019-08-21 09:05:59 d5e665e5 72.0 348.0 3356.0
2019-08-21 09:05:59 d5e665e5 5.0 69.0 125.0
2019-08-21 09:05:59 f9e775f9 0.0 470.0 290.0
2019-08-21 09:05:59 NaN NaN NaN NaN
Upvotes: 1
Reputation: 309
You can use pandas
functions stack
and concat
to do this.
stack
to unlist the list the column var_A
concat
to unnest the dictionary and put it into seperate columnsYou can use the following code to do the same. Assuming your dictionary to be df.
Unlist:
df = df.apply(lambda x: x.apply(pd.Series).stack()).reset_index().drop('level_1', 1)
Unnest:
df = pd.concat([df.drop('var_A', axis=1), df['var_A'].apply(pd.Series)], axis=1).drop(0,1)
Upvotes: 1