srinath

Reputation: 2998

Filling scores in rows based on a column value in pandas

I have a DataFrame like this:

Example1:

score_df
========
rid score Flag
1    5     y
1    6     n
1    7     y
1    8     n

I need to copy the score from the row where Flag = 'y' first appears to all the rows after it. In this example, Flag = 'y' first appears on the first row, where the score is 5, so I would copy the score 5 to the remaining rows. The output would look like this:

score_df
========
rid score Flag
1    5     y
1    5     n
1    5     y
1    5     n


Example2:

score_df
========
rid score Flag
1    5     n
1    6     y
1    7     n
1    8     y

In this case, the first Flag = 'y' is on the second row, where the score is 6, so the first row is left unchanged. The output would look like this:

score_df
========
rid score Flag
1    5     n
1    6     y
1    6     n
1    6     y

So, I would like to fill the rows based on the first occurrence of the flag value. How can I do this in pandas without looping?

Upvotes: 0

Views: 247

Answers (1)

jezrael

Reputation: 863116

Create a mask that matches every row from the first y value onward: compare with Series.eq, take the cumulative sum with Series.cumsum, and test for values greater than 0 with Series.gt. Then assign the first matched score to the score column of the filtered rows only:

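# True from the first 'y' onward (data from Example 1)
m = df['Flag'].eq('y').cumsum().gt(0)

# Copy the first matched score to all matched rows
df.loc[m, 'score'] = df.loc[m, 'score'].iat[0]
print(df)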
   rid  score Flag
0    1      5    y
1    1      5    n
2    1      5    y
3    1      5    n

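# Same steps on Example 2; the mask has to be recomputed for the new data
m = df['Flag'].eq('y').cumsum().gt(0)
df.loc[m, 'score'] = df.loc[m, 'score'].iat[0]
print(df)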
   rid  score Flag
0    1      5    n
1    1      6    y
2    1      6    n
3    1      6    y
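
To make the mask easier to follow, here is a self-contained sketch that traces each step on Example 2 from the question. The intermediate names is_y and running are only illustrative, and the DataFrame is rebuilt by hand on the assumption that it matches the question's data:

```python
import pandas as pd

# Example 2 from the question, rebuilt by hand (assumed to match the asker's data)
df = pd.DataFrame({
    "rid": [1, 1, 1, 1],
    "score": [5, 6, 7, 8],
    "Flag": ["n", "y", "n", "y"],
})

is_y = df["Flag"].eq("y")   # True where Flag == 'y'
running = is_y.cumsum()     # number of 'y' values seen so far in each row
m = running.gt(0)           # True from the first 'y' onward

# Overwrite the score on every masked row with the first masked score
df.loc[m, "score"] = df.loc[m, "score"].iat[0]
print(df)
```

Because the cumulative sum never decreases, the mask stays True for every row after the first y, which is what lets a single assignment replace the loop.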

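If it is possible that nothing matches (no y value in the column), the solution is a bit different, because .iat[0] fails on an empty selection. Use next with iter instead: it returns the first value if one exists and otherwise the default 0. In that case the mask is all False, so the 0 is assigned to no rows and nothing is replaced.

This solution also works for the data above, so it is more general.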

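# Data with no 'y' at all: the mask is all False, so no row is changed
m = df['Flag'].eq('y').cumsum().gt(0)
df.loc[m, 'score'] = next(iter(df.loc[m, 'score']), 0)
print(df)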
   rid  score Flag
0    1      5    n
1    1      6    n
2    1      7    n
3    1      8    n
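
As a rough sketch, the general version can be wrapped in a small helper and checked against all three cases. The function name fill_after_first_flag and its parameters are hypothetical and not part of pandas; it works on a copy so the input frame is left untouched:

```python
import pandas as pd

def fill_after_first_flag(df, flag_col="Flag", score_col="score", flag="y"):
    """Hypothetical helper: copy the score at the first `flag` row to all later rows."""
    out = df.copy()
    # True from the first row where flag_col == flag, False before it
    m = out[flag_col].eq(flag).cumsum().gt(0)
    # First selected score, or 0 if nothing is selected; with an all-False
    # mask the 0 is assigned to no rows, so the frame is unchanged
    out.loc[m, score_col] = next(iter(out.loc[m, score_col]), 0)
    return out

def make_df(flags):
    # Same rid/score columns as the question, with varying Flag values
    return pd.DataFrame({"rid": [1, 1, 1, 1], "score": [5, 6, 7, 8], "Flag": flags})

ex1 = fill_after_first_flag(make_df(["y", "n", "y", "n"]))
ex2 = fill_after_first_flag(make_df(["n", "y", "n", "y"]))
no_match = fill_after_first_flag(make_df(["n", "n", "n", "n"]))
print(ex1, ex2, no_match, sep="\n\n")
```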

Upvotes: 2
