user1124825
user1124825

Reputation: 359

Reshaping Pandas data frame (a complex case!)

I want to reshape the following data frame:

index   id     numbers
1111    5      58.99
2222    5      75.65
1000    4      66.54 
11      4      60.33
143     4      62.31
145     51     30.2
1       7      61.28

The reshaped data frame should be like the following:

id   1       2        3 
5    58.99   75.65    nan
4    66.54   60.33    62.31
51   30.2    nan      nan
7    61.28   nan      nan

I use the following code to do this.

import pandas as pd
dtFrame = pd.read_csv("data.csv")
ids = dtFrame['id'].unique()
temp = dtFrame.groupby(['id'])
temp2 = {}
for i in ids:
    temp2[i]= temp.get_group(i).reset_index()['numbers'] 
dtFrame = pd.DataFrame.from_dict(temp2)
dtFrame = dtFrame.T

Although the above code solve my problem but is there a more simple way to achieve this. I tried Pivot table but it does not solve the problem perhaps it requires to have same number of element in each group. Or may be there is another way which I am not aware of, please share your thoughts about it.

Upvotes: 3

Views: 116

Answers (1)

unutbu
unutbu

Reputation: 880299

In [69]: df.groupby(df['id'])['numbers'].apply(lambda x: pd.Series(x.values)).unstack()
Out[69]: 
        0      1      2
id                     
4   66.54  60.33  62.31
5   58.99  75.65    NaN
7   61.28    NaN    NaN
51  30.20    NaN    NaN

This is really quite similar to what you are doing except that the loop is replaced by apply. The pd.Series(x.values) has an index which by default ranges over integers starting at 0. The index values become the column names (above). It doesn't matter that the various groups may have different lengths. The apply method aligns the various indices for you (and fills missing values with NaN). What a convenience!

I learned this trick here.

Upvotes: 2

Related Questions