Reputation: 597
i have a df which I want to add a column that shows the student who is place (1) from the group('subject', 'class') and update the column after there is a new place (1).
code:
data = {
"subject": ['eng','math','math','math','math','math','math','math','math','math','math','math','math','eng','eng'],
"class": ['Class_4','Class_3','Class_3','Class_3','Class_3','Class_3','Class_3','Class_3','Class_3',
'Class_3','Class_3','Class_3','Class_3','Class_4','Class_4'],
"student": ['henry','pan','larry','larry','henry','larry','terry','henry','henry',
'henry','pan','pan','moose','pan','moose'],
"place": [7,8,10,1,7,10,9,7,11,1,11,3,6,2,4]}
df = pd.DataFrame(data)
╔═════════╦═════════╦═════════╦═══════╗ ║ subject ║ class ║ student ║ place ║ ╠═════════╬═════════╬═════════╬═══════╣ ║ eng ║ Class_4 ║ henry ║ 7 ║ ║ math ║ Class_3 ║ pan ║ 8 ║ ║ math ║ Class_3 ║ larry ║ 10 ║ ║ math ║ Class_3 ║ larry ║ 1 ║ ║ math ║ Class_3 ║ henry ║ 7 ║ ║ math ║ Class_3 ║ larry ║ 10 ║ ║ math ║ Class_3 ║ terry ║ 9 ║ ║ math ║ Class_3 ║ henry ║ 7 ║ ║ math ║ Class_3 ║ henry ║ 11 ║ ║ math ║ Class_3 ║ henry ║ 1 ║ ║ math ║ Class_3 ║ pan ║ 11 ║ ║ math ║ Class_3 ║ pan ║ 3 ║ ║ math ║ Class_3 ║ moose ║ 6 ║ ║ eng ║ Class_4 ║ pan ║ 2 ║ ║ eng ║ Class_4 ║ moose ║ 4 ║ ╚═════════╩═════════╩═════════╩═══════╝
result trying to get
╔═════════╦═════════╦═════════╦═══════╦═════════╗ ║ subject ║ class ║ student ║ place ║ new_col ║ ╠═════════╬═════════╬═════════╬═══════╬═════════╣ ║ eng ║ Class_4 ║ henry ║ 7 ║ nil ║ ║ math ║ Class_3 ║ pan ║ 8 ║ nil ║ ║ math ║ Class_3 ║ larry ║ 10 ║ nil ║ ║ math ║ Class_3 ║ larry ║ 1 ║ nil ║ ║ math ║ Class_3 ║ henry ║ 7 ║ larry ║ ║ math ║ Class_3 ║ larry ║ 10 ║ larry ║ ║ math ║ Class_3 ║ terry ║ 9 ║ larry ║ ║ math ║ Class_3 ║ henry ║ 7 ║ larry ║ ║ math ║ Class_3 ║ henry ║ 11 ║ larry ║ ║ math ║ Class_3 ║ henry ║ 1 ║ larry ║ ║ math ║ Class_3 ║ pan ║ 11 ║ henry ║ ║ math ║ Class_3 ║ pan ║ 3 ║ henry ║ ║ math ║ Class_3 ║ moose ║ 6 ║ henry ║ ║ eng ║ Class_4 ║ pan ║ 2 ║ nil ║ ║ eng ║ Class_4 ║ moose ║ 4 ║ nil ║ ╚═════════╩═════════╩═════════╩═══════╩═════════╝
kindly advise. thank you
Upvotes: 0
Views: 68
Reputation: 71689
Mask the values in place
column which are not equal to 1
then group
the masked column by subject
and class
and use ffill
to forward fill the values
df['new_col'] = df['student'].mask(df['place'] != 1)
df['new_col'] = df.groupby(['subject', 'class'])['new_col'].ffill()
print(df)
subject class student place new_col
0 eng Class_4 henry 7 NaN
1 math Class_3 pan 8 NaN
2 math Class_3 larry 10 NaN
3 math Class_3 larry 1 larry
4 math Class_3 henry 7 larry
5 math Class_3 larry 10 larry
6 math Class_3 terry 9 larry
7 math Class_3 henry 7 larry
8 math Class_3 henry 11 larry
9 math Class_3 henry 1 henry
10 math Class_3 pan 11 henry
11 math Class_3 pan 3 henry
12 math Class_3 moose 6 henry
13 eng Class_4 pan 2 NaN
14 eng Class_4 moose 4 NaN
Upvotes: 2