change dataframe to another structure according id

Question

My initial pandas dataframe looks as follows:

df = pd.DataFrame(np.array([[999,888,1,0], [988,899,2,0], [981,821,3,0],[721,789,4,1],[723,745,5,1],[444,432,6,2],[423,412,7,2],[423,455,8,2],[478,432,9,2]]),
                   columns=['a', 'b', 'c', 'id'])

>>>df
a    b   c id
999  888 1 0
988  899 2 0
981  821 3 0
721  789 4 1
723  745 5 1
444  432 6 2
423  412 7 2
423  455 8 2
478  432 9 2

Now I want to create a new dataframe with the following structure:

0    1    2    3   4  5
999  988  888  899 1  2
721  723  789  745 4  5
444  423  432  412 6  7

So in fact I just want to get the first two values of each id and each column (a,b,c). Example: first row -> col a: 999, 988 + col b: 888, 899 + col c: 1, 2

Edit: Thanks to Correlian I use the following code:

size = 3


data = df.groupby('id').head(2).melt('id').sort_values(by=['id', 'variable'])['value']
out = pd.DataFrame(data.values.reshape((size, -1)))

Corralien · Accepted Answer

Compute value count of id to get the size (number of unique id, here 3) and the smallest count (here 2 for id=1). Group by id and keep the first count row for each group using head.

Then, use melt to flat your dataframe and sort it by id. Finally, reshape your dataframe according the size variable and create a new dataframe:

size, count = df.value_counts('id').agg(['size', 'min'])
data = df.groupby('id').head(count).melt('id').sort_values('id')['value']
out = pd.DataFrame(data.values.reshape((size, -1)))
print(out)

# Output:
     0    1    2    3
0  999  988  888  899
1  721  723  789  745
2  444  423  432  412

Update

I have changed my initial dataframe to 3 columns (a,b,c) see my initial question. How do I have to change your code?

size, count = df.value_counts('id').agg(['size', 'min'])
m = df.groupby('id').head(count).iloc[:, :-1].values
df = pd.DataFrame(np.hstack(m.T.reshape(-1, size, count)))
print(df)

# Output:
     0    1    2    3  4  5
0  999  988  888  899  1  2
1  721  723  789  745  4  5
2  444  423  432  412  6  7

change dataframe to another structure according id

Answers (2)

Related Questions