Reputation: 537
I have a dataframe that after applying some filters looks like this:
index A .... J
55 7 .... [{'sqlStatement': 'DELETE FROM Z WHERE D=2000', 'number': 200, 'time':3556, 'timestamp': 'Jun 13, 2017 5:41:22 PM' }, {'sqlStatement': 'DELETE FROM U WHERE Z=100', 'number': 450, 'time':8906, 'timestamp': 'Jun 13, 2017 5:49:22 PM'}, {'sqlStatement': 'DELETE FROM U WHERE Z=150', 'number': 270, 'time':9806, 'timestamp': 'Jun 13, 2017 5:58:45 PM'}]
193 7 .... [{'sqlStatement': 'DELETE FROM T WHERE F=98', 'number': 8043, 'time':463465, 'timestamp': 'Jun 13, 2017 6:01:22 PM' }, {'sqlStatement': 'DELETE FROM F WHERE A=98 AND Z=100 ', 'number': 9890, 'time':487569, 'timestamp': 'Jun 13, 2017 6:09:28 PM'}]
I need to separate the column J into a new dataframe. For this, I use the following code:
for i, (k, v) in enumerate (df['J'].items()):
df = pd.DataFrame(v)
And I obtain:
index sqlStatement number time timestamp
1 DELETE FROM Z WHERE D=2000 200 3556 Jun 13, 2017 5:41:22 PM
2 DELETE FROM U WHERE Z=100 450 8906 Jun 13, 2017 5:41:22 PM
3 DELETE FROM U WHERE Z=150 270 9806 Jun 13, 2017 5:58:45 PM
4 DELETE FROM T WHERE F=98 8043 463465 Jun 13, 2017 6:01:22 PM
5 DELETE FROM T WHERE F=98 AND Z=100 9890 487569 Jun 13, 2017 6:09:28 PM
The problem is that I would like to add a column containing the index of the observation that generated these new ones. What I would like to achieve is:
index sqlStatement number time timestamp old_index
1 DELETE FROM Z WHERE D=2000 200 3556 Jun 13, 2017 5:41:22 PM 55
2 DELETE FROM U WHERE Z=100 450 8906 Jun 13, 2017 5:41:22 PM 55
3 DELETE FROM U WHERE Z=150 270 9806 Jun 13, 2017 5:58:45 PM 55
4 DELETE FROM T WHERE F=98 8043 463465 Jun 13, 2017 6:01:22 PM 193
5 DELETE FROM T WHERE F=98 AND Z=100 9890 487569 Jun 13, 2017 6:09:28 PM 193
Can you please help me?
Upvotes: 0
Views: 43
Reputation: 62533
j = [[{'number': 200,
'sqlStatement': 'DELETE FROM Z WHERE D=2000',
'time': 3556,
'timestamp': 'Jun 13, 2017 5:41:22 PM'},
{'number': 450,
'sqlStatement': 'DELETE FROM U WHERE Z=100',
'time': 8906,
'timestamp': 'Jun 13, 2017 5:49:22 PM'},
{'number': 270,
'sqlStatement': 'DELETE FROM U WHERE Z=150',
'time': 9806,
'timestamp': 'Jun 13, 2017 5:58:45 PM'}],
[{'number': 8043,
'sqlStatement': 'DELETE FROM T WHERE F=98',
'time': 463465,
'timestamp': 'Jun 13, 2017 6:01:22 PM'},
{'number': 9890,
'sqlStatement': 'DELETE FROM F WHERE A=98 AND Z=100 ',
'time': 487569,
'timestamp': 'Jun 13, 2017 6:09:28 PM'}]]
df = pd.DataFrame({'J': j})
pandas.DataFrame.explode
pandas v0.25
df_explode = df.explode('J')
pd.Series
to expand the dicts
:df_explode = df_explode.J.apply(pd.Series)
df_explode.reset_index(inplace=True)
df_explode.rename(columns={'index': 'old_index'})
reset_index
and df_explode.index
will be the original index.Upvotes: 1