Python Pandas new dataframe column with group by and condition

Question

I have a Pandas dataframe that looks as follows.

player  count1  count2
A       1       1
A       2       1
A       3       1
A       4       2
A       5       2
B       1       1
B       2       2
B       3       2
B       4       2

Column player contains names, count1 is a cumulative sum and column count2 contains other counts.

I now want to create a new column that contains the value of count1 where the column count2 first contains the value 2.

Hence, the result should look like this:

player  count1  count2  new
A       1       1       4
A       2       1       4
A       3       1       4
A       4       2       4
A       5       2       4
B       1       1       2
B       2       2       2
B       3       2       2
B       4       2       2

I tried to do it with transform, but I cannot figure out how to combine it with the condition based on the count2 column (and the tanking the value of the count1 column).

Without the groupby it works like this, but I don't know where and how to add the groupby:

df['new'] = df.loc[matches['count2'] == 2, 'count1'].min()

jezrael · Accepted Answer

Use map by Series:

s = df[df['count2'] == 2].drop_duplicates(['player']).set_index('player')['count1']

df['new'] = df['player'].map(s)
print (df)
  player  count1  count2  new
0      A       1       1    4
1      A       2       1    4
2      A       3       1    4
3      A       4       2    4
4      A       5       2    4
5      B       1       1    2
6      B       2       2    2
7      B       3       2    2
8      B       4       2    2

Detail:

First filter only 2 rows by boolean indexing:

print (df[df['count2'] == 2])
  player  count1  count2
3      A       4       2
4      A       5       2
6      B       2       2
7      B       3       2
8      B       4       2

And then remove dupes by player column by drop_duplicates:

print (df[df['count2'] == 2].drop_duplicates(['player']))
  player  count1  count2
3      A       4       2
6      B       2       2

Python Pandas new dataframe column with group by and condition

Answers (1)

Related Questions