Reputation: 419
I have a dataset that can be grouped with the following:
df.groupby(df.batch.str[:7])
The different group sizes are:
df.groupby(df.batch.str[:7]).size().unique()
array([1, 2, 3, 4, 5, 6, 7])
Let's say I take a group of size 4, arranged by the time column:
      col1  col2  time
0  rt_2345   NaN  1:00
1  rt_2345   3.0  2:00
2  rt_2345   4.0  3:00
3  rt_2345  46.0  4:00
It has 4 rows; the first col2 value of each group is NaN, and I want to shift col2 up by 1 row within every group.
Goal
      col1  col2  time
0  rt_2345   3.0  1:00
1  rt_2345   4.0  2:00
2  rt_2345  46.0  3:00
3  rt_2345   NaN  4:00
This can be done with the following code for group sizes of 2, but it is not valid for sizes 3, 4, 5, 6, 7:
.fillna(df.groupby(df.batch.str[:7])['col1'].bfill()))
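For reference, the sample group above can be rebuilt like this (a minimal sketch; col1 stands in for the real batch column, which is an assumption for illustration):

import pandas as pd
import numpy as np

# build the 4-row sample group shown above
df = pd.DataFrame({
    'col1': ['rt_2345'] * 4,
    'col2': [np.nan, 3.0, 4.0, 46.0],
    'time': ['1:00', '2:00', '3:00', '4:00'],
})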
Upvotes: 1
Views: 813
Reputation: 862441
df['col2'] = df.groupby(df.batch.str[:7])['col2'].shift(-1)
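Applied to the sample group (grouping on col1 here, since the sample frame has no batch column; an illustrative run only):

# within each group, shift col2 up one row; the last row becomes NaN
df['col2'] = df.groupby(df.col1.str[:7])['col2'].shift(-1)
print(df)
#       col1  col2  time
# 0  rt_2345   3.0  1:00
# 1  rt_2345   4.0  2:00
# 2  rt_2345  46.0  3:00
# 3  rt_2345   NaN  4:00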
EDIT: If sorting by multiple columns is needed, it is possible to create a helper column a
from the first 7 characters of batch and pass it to DataFrame.sort_values:
df = df.assign(a = df.batch.str[:7]).sort_values(['a', 'time'],
                                                 ascending=[True, False])
df['col2'] = df.groupby('a')['col2'].shift(-1)
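If the helper column is not wanted afterwards it can be dropped; a sketch of the full pipeline (assuming the real data has batch and time columns) might look like:

# sort by the helper key and time, shift within each group, then discard the key
df = (df.assign(a=df.batch.str[:7])
        .sort_values(['a', 'time'], ascending=[True, False]))
df['col2'] = df.groupby('a')['col2'].shift(-1)
df = df.drop(columns='a')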
Upvotes: 1
Reputation: 323226
We can try using sorted with a key; in this situation you do not need to worry about how many NaN values are at the top:
df['new'] = df.groupby(df.col1.str[:7])['col2'].apply(lambda x: sorted(x, key=pd.isnull)).explode().values
df
Out[145]:
      col1  col2  time  new
0  rt_2345   NaN  1:00    3
1  rt_2345   3.0  2:00    4
2  rt_2345   4.0  3:00   46
3  rt_2345  46.0  4:00  NaN
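The trick is that key=pd.isnull sorts on False/True and Python's sorted is stable, so the non-null values keep their relative order and every NaN is pushed to the end of its group. A quick standalone illustration on one group's values:

import pandas as pd

# NaN gives True for pd.isnull, so it sorts after all real values
sorted([float('nan'), 3.0, 4.0, 46.0], key=pd.isnull)
# [3.0, 4.0, 46.0, nan]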
Upvotes: 1