Reputation: 14037
here is what I am trying to do:
>>>import pandas as pd
>>>dftemp = pd.DataFrame({'a': [1] * 3 + [2] * 3, 'b': 'a a b c d e'.split()})
a b
0 1 a
1 1 a
2 1 b
3 2 c
4 2 d
5 2 e
6 3 f
how to transpose column 'b' grouped by column 'a', so that output looks like:
a b0 b1 b2
0 1 a a b
3 2 c d e
6 3 f NaN NaN
Upvotes: 3
Views: 1402
Reputation: 59549
Given the ordering of your DataFrame
you could find where the group changes and use np.split
to create a new DataFrame
.
import numpy as np
import pandas as pd
splits = dftemp[(dftemp.a != dftemp.a.shift())].index.values
df = pd.DataFrame(np.split(dftemp.b.values, splits[1:])).add_prefix('b').fillna(np.NaN)
df['a'] = dftemp.loc[splits, 'a'].values
b0 b1 b2 a
0 a a b 1
1 c d e 2
2 f NaN NaN 3
Upvotes: 0
Reputation: 51155
Using pivot_table
with cumcount
:
(df.assign(flag=df.groupby('a').b.cumcount())
.pivot_table(index='a', columns='flag', values='b', aggfunc='first')
.add_prefix('B'))
flag B0 B1 B2
a
1 a a b
2 c d e
3 f NaN NaN
Upvotes: 3
Reputation: 4607
You can try of grouping by column and flattening the values associated with group and reframe it as dataframe
df = df.groupby(['a'])['b'].apply(lambda x: x.values.flatten())
pd.DataFrame(df.values.tolist(),index=df.index).add_prefix('B')
Out:
B0 B1 B2
a
1 a a b
2 c d e
3 f None None
Upvotes: 2
Reputation: 3770
you could probably try something like this :
>>> dftemp = pd.DataFrame({'a': [1] * 3 + [2] * 2 + [3]*1, 'b': 'a a b c d e'.split()})
>>> dftemp
a b
0 1 a
1 1 a
2 1 b
3 2 c
4 2 d
5 3 e
>>> dftemp.groupby('a')['b'].apply(lambda df: df.reset_index(drop=True)).unstack()
0 1 2
a
1 a a b
2 c d None
3 e None None
Upvotes: 0