Reputation: 359
I want to reshape the following data frame:
index id numbers 1111 5 58.99 2222 5 75.65 1000 4 66.54 11 4 60.33 143 4 62.31 145 51 30.2 1 7 61.28
The reshaped data frame should be like the following:
id 1 2 3 5 58.99 75.65 nan 4 66.54 60.33 62.31 51 30.2 nan nan 7 61.28 nan nan
I use the following code to do this.
import pandas as pd
dtFrame = pd.read_csv("data.csv")
ids = dtFrame['id'].unique()
temp = dtFrame.groupby(['id'])
temp2 = {}
for i in ids:
temp2[i]= temp.get_group(i).reset_index()['numbers']
dtFrame = pd.DataFrame.from_dict(temp2)
dtFrame = dtFrame.T
Although the above code solve my problem but is there a more simple way to achieve this. I tried Pivot table but it does not solve the problem perhaps it requires to have same number of element in each group. Or may be there is another way which I am not aware of, please share your thoughts about it.
Upvotes: 3
Views: 116
Reputation: 880299
In [69]: df.groupby(df['id'])['numbers'].apply(lambda x: pd.Series(x.values)).unstack()
Out[69]:
0 1 2
id
4 66.54 60.33 62.31
5 58.99 75.65 NaN
7 61.28 NaN NaN
51 30.20 NaN NaN
This is really quite similar to what you are doing except that the loop is replaced by apply
. The pd.Series(x.values)
has an index which by default ranges over integers starting at 0
. The index values become the column names (above). It doesn't matter that the various groups may have different lengths. The apply
method aligns the various indices for you (and fills missing values with NaN
). What a convenience!
I learned this trick here.
Upvotes: 2