Reputation: 21
I have a pandas DataFrame with one column containing lists, like:
>>> import pandas as pd
>>> d = {'A': [1, 2, 3], 'B': [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]}
>>> df = pd.DataFrame(data=d)
>>> df
   A                B
0  1  [0.1, 0.2, 0.3]
1  2  [0.4, 0.5, 0.6]
2  3  [0.7, 0.8, 0.9]
I can unpack these lists into individual columns:
>>> df[['x','y','z']] = df.B.tolist()
>>> df
   A                B    x    y    z
0  1  [0.1, 0.2, 0.3]  0.1  0.2  0.3
1  2  [0.4, 0.5, 0.6]  0.4  0.5  0.6
2  3  [0.7, 0.8, 0.9]  0.7  0.8  0.9
but I would like to do this with a chaining-compatible command.
I thought about using .assign, but there I need to define each variable explicitly, and unpacking via lambdas gets a bit involved.
>>> (df.assign(q=lambda df_: df_.B.apply(lambda x: x[0]),
...            w=lambda df_: df_.B.apply(lambda x: x[1]),
...            u=lambda df_: df_.B.apply(lambda x: x[2])))
   A                B    q    w    u
0  1  [0.1, 0.2, 0.3]  0.1  0.2  0.3
1  2  [0.4, 0.5, 0.6]  0.4  0.5  0.6
2  3  [0.7, 0.8, 0.9]  0.7  0.8  0.9
Is there a better way to do this?
Upvotes: 1
Views: 1132
Reputation: 21
Building on the great hints from @mozway, with two simplifications:
Use zip to create a dict inside of assign:
df.assign(**dict(zip(['x', 'y', 'z'], zip(*df['B']))))
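In case the nested zip looks opaque, here is a small sketch of what each piece does, using the DataFrame from the question. The intermediate names columns and new_cols are only for illustration; the one-liner above does the same thing in a single expression. It assumes every list in B has the same length as the list of new column names.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]})

# zip(*rows) transposes: one tuple per position instead of one list per row
columns = list(zip(*df['B']))

# Pair each new column name with its tuple of values
new_cols = dict(zip(['x', 'y', 'z'], columns))

# assign accepts the dict as keyword arguments and returns a new DataFrame
out = df.assign(**new_cols)
print(out)
```

Note that this version refers to df by name, so in the middle of a longer chain it would need to be wrapped in a lambda or in pipe.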
Upvotes: 1
Reputation: 262224
pipe is always useful to chain anything:
(pd.DataFrame(d)
   .pipe(lambda d: d.join(pd.DataFrame(d['B'].to_list(),
                                       columns=['q', 'w', 'u'],
                                       index=d.index))
        )
)
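A note on index=d.index in the join version: join aligns on the index, so the new frame has to carry the same index as the original. The sketch below illustrates this with a made-up non-default index (10, 20, 30), which is an assumption for demonstration and not part of the question's data.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]},
                  index=[10, 20, 30])  # hypothetical non-default index

names = ['q', 'w', 'u']

# Without index=..., the new frame gets a default 0..2 index,
# which matches none of the labels 10, 20, 30, so the join yields NaN
misaligned = df.join(pd.DataFrame(df['B'].to_list(), columns=names))

# With index=df.index, the rows line up as intended
aligned = df.join(pd.DataFrame(df['B'].to_list(), columns=names, index=df.index))
```

With the default RangeIndex from the question both variants give the same result, so the difference only shows up after filtering, sorting, or setting a custom index earlier in the chain.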
Variant with pipe + assign:
df.pipe(lambda d: d.assign(**dict(zip(['q', 'w', 'u'], zip(*d['B'].to_list())))))
Output:
   A                B    q    w    u
0  1  [0.1, 0.2, 0.3]  0.1  0.2  0.3
1  2  [0.4, 0.5, 0.6]  0.4  0.5  0.6
2  3  [0.7, 0.8, 0.9]  0.7  0.8  0.9
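If this comes up more than once, the lambda can be pulled out into a small named function and passed to pipe directly. This is only a sketch: unpack_list_column is a hypothetical helper name, not a pandas function, and it assumes every list in the column has exactly as many elements as there are new names.

```python
import pandas as pd

def unpack_list_column(frame, column, names):
    """Return frame joined with new columns `names` built from the lists in `column`."""
    unpacked = pd.DataFrame(frame[column].to_list(),
                            columns=names,
                            index=frame.index)  # keep the original index for alignment
    return frame.join(unpacked)

d = {'A': [1, 2, 3],
     'B': [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]}

# pipe passes the DataFrame as the first argument, followed by the extra arguments
result = (pd.DataFrame(d)
            .pipe(unpack_list_column, 'B', ['q', 'w', 'u'])
            .drop(columns='B'))
print(result)
```

The drop at the end is optional and only there to show that the chain can continue after the unpacking step.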
Upvotes: 0