Reputation: 341

How to convert pandas groups into different columns?

I have a data frame like below.

unit time s1 s2 ....
1    1    2  3
1    2    4  5
1    3    9  7
2    1    5  2
2    2    3  1

I would like to group the data by unit and keep the same number of last observations (ordered by time) for every unit. That number is the size of the smallest group (unit 2 has only 2 observations). The s1 values of each unit should then go into a separate column. So, something like below.

unit_1 unit_2 
   4      5 
   9      3

Thank you.

Upvotes: 0

Answers (3)

Akshay Sehgal

Reputation: 19307

This should solve your problem:

def f(col):
    #First step is to get the last 2 rows for each group using .tail(2)
    #(.copy() avoids a SettingWithCopyWarning when 'time' is overwritten below)
    dff = df[['unit','time',col]].sort_values(by=['unit','time'],axis=0).groupby(['unit']).tail(2).copy()

    #Next we need the ordered rank of the time values instead of the actual values of time,
    #so that times 2,3 become 1,2 and times 1,2 stay 1,2.
    #Rank only the 'time' column; ranking the whole frame returns several columns.
    dff['time'] = dff.groupby(['unit'])['time'].rank()

    #Last we pivot over the time and units to get the columns that you need for correlation analysis
    dff = dff.pivot(index='time',columns='unit',values=col).reset_index(drop=True).add_prefix('unit_')
    return dff

f('s1')

unit    unit_1  unit_2
   0         4       5
   1         9       3

Use this function for a faster runtime. It skips the sort and rank steps and uses a per-unit counter instead, so it assumes the rows are already ordered by time within each unit.

def f(col):
    filt = df[['unit',col]].groupby('unit').tail(2).copy()  #filter last 2 (copy so we can add a column)
    filt['count'] = filt.groupby('unit').cumcount()  #add a counter column for pivot
    
    #Use counter column as index and unit as column for pivot, then add prefix
    filt = filt.pivot(index='count',columns='unit',values=col).reset_index(drop=True).add_prefix("unit_")
    return filt
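
Both functions above hard-code 2 as the number of rows to keep. As a minimal sketch of how that number could be derived from the data instead, the version below takes the size of the smallest unit. The helper name `last_common_rows` and the `pos` column are made up for illustration, and the sample frame is the one from the question with s2 left out.

```python
import pandas as pd

# Sample data from the question (s2 omitted for brevity).
df = pd.DataFrame({
    "unit": [1, 1, 1, 2, 2],
    "time": [1, 2, 3, 1, 2],
    "s1":   [2, 4, 9, 5, 3],
})

def last_common_rows(df, col):
    """Hypothetical helper: pivot the last n rows of `col` per unit into
    unit_* columns, where n is the size of the smallest unit."""
    n = int(df.groupby("unit").size().min())       # smallest group size
    tail = df.sort_values(["unit", "time"]).groupby("unit").tail(n).copy()
    tail["pos"] = tail.groupby("unit").cumcount()  # 0..n-1 within each unit
    return tail.pivot(index="pos", columns="unit", values=col).add_prefix("unit_")

result = last_common_rows(df, "s1")
print(result)
```

With a third unit that has only one observation, n would drop to 1 and every unit would contribute a single row.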

Upvotes: 2

wwnde

Reputation: 26686

Group by unit and pass a list of positions to nth. Drop the unwanted columns, transpose the dataframe, and add the prefix unit_ to the names. Then transpose and ravel to combine the columns. Note that this relies on nth returning unit as the index, which is how pandas behaved before version 2.0.

g = df.groupby('unit', group_keys=False).nth([-1,-2]).drop(columns=['time','s2']).T.add_prefix('unit_')#.unstack('s1')

final = pd.DataFrame({'unit_1': g['unit_1'].values.T.ravel(),
                    'unit_2': g['unit_2'].values.T.ravel()})
final

    unit_1  unit_2
0       4       5
1       9       3
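
If nth does not return unit as the index in your pandas version (its behaviour changed in pandas 2.0), a sketch along the following lines avoids it. It uses tail and unstack rather than nth, so it is a swapped-in technique and not the code above; the `pos` column name is an assumption made for illustration, and the rows are assumed to be ordered by time within each unit.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "unit": [1, 1, 1, 2, 2],
    "time": [1, 2, 3, 1, 2],
    "s1":   [2, 4, 9, 5, 3],
    "s2":   [3, 5, 7, 2, 1],
})

last = df.groupby("unit").tail(2)  # last 2 rows per unit, original order kept
final = (
    last.assign(pos=last.groupby("unit").cumcount())  # 0, 1 within each unit
        .set_index(["pos", "unit"])["s1"]             # Series keyed by (pos, unit)
        .unstack("unit")                              # units become columns
        .add_prefix("unit_")
        .reset_index(drop=True)
)
print(final)
```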

Upvotes: 1

Roman_N

Reputation: 195

So, I made this solution:

import pandas as pd
import numpy as np

df = pd.DataFrame({'units': [1,1,1,2,2], 's1':[2,4,9,5,3]})

new_df = df.groupby('units').tail(2) # Taking the last 2 values
new_df
Out:
   units  s1
1      1   4
2      1   9
3      2   5
4      2   3

units_list = new_df.units.unique() # Which units do we have?
units_columns = [] # For col names
form_dict = {}
# Each unit contributes 2 values, so new_df has 2n rows,
# where n is the number of units in new_df.
n = 0

for unit in units_list:
    units_columns.append('unit_{}'.format(unit))

# Walk through s1 two rows at a time; each pair belongs to the next unit.
while n != len(new_df['s1']):
    for col in units_columns:
        form_dict.update({col: new_df['s1'].iloc[n:n+2].values})
        n += 2
        
final_df = pd.DataFrame(form_dict)
final_df

And the result is:

   unit_1  unit_2
0       4       5
1       9       3
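
The same idea, one column of the last 2 values per unit, can probably be written more compactly by iterating over the groups directly. This is only a sketch on the same sample data, not a replacement for the loop above; like the loop, it assumes every unit has at least 2 rows.

```python
import pandas as pd

df = pd.DataFrame({'units': [1, 1, 1, 2, 2], 's1': [2, 4, 9, 5, 3]})

# One dict entry per unit: column name -> last 2 values of s1 for that unit.
final_df = pd.DataFrame({
    'unit_{}'.format(unit): group['s1'].tail(2).to_numpy()
    for unit, group in df.groupby('units')
})
print(final_df)
```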

Upvotes: 1
