Pandas get column value of first occurrence of condition per group

Question

I have a pandas dataframe as follows:

player  condition   num
A       0           1
A       1           2
A       1           3
B       0           1
B       0           2
B       1           3
B       0           4

I want to add a column that stores the minimum value of the num column where per player the condition column is 1.

The result, hence, should look like this:

player  condition   num  numCondition
A       0           1    2
A       1           2    2
A       1           3    2
B       0           1    3
B       0           2    3
B       1           3    3
B       0           4    3

I know that I need a groupBy() per player. I will then need an apply() with probably using the lambda() function. But I could not fit the pieces together, yet.

EDIT: The condition column is a simplification in my example. In reality it should simply be possible to use the usual pandas dataframe queries to filter the rows. E.g. df[(df.condition == 1) & (df.otherCondition > 10)]

BENY · Accepted Answer

By using drop_duplicates

df.player.map(df[df.condition==1].drop_duplicates(['player'],keep='first').set_index('player').num)
    Out[221]: 
    0    2
    1    2
    2    2
    3    3
    4    3
    5    3
    6    3
    Name: player, dtype: int64

df['numCondition']=df.player.map(df[df.condition==1].drop_duplicates(['player'],keep='first').set_index('player').num)
df
Out[223]: 
  player  condition  num  numCondition
0      A          0    1             2
1      A          1    2             2
2      A          1    3             2
3      B          0    1             3
4      B          0    2             3
5      B          1    3             3
6      B          0    4             3

Pandas get column value of first occurrence of condition per group

Answers (2)

Related Questions