Luiz Matias
Luiz Matias

Reputation: 49

Is there a way to subtract a column values according to other column?

I have this dataframe:

  name     color   number
0  john     red       4
1   ana     red       4
2   ana     red       5
3  paul     red       6
4  mark     red       3
5   ana  yellow      10
6  john  yellow      11
7  john  yellow      12
8  john     red      13

If the value in color column change (according to the name column), I want to create another column with the subtraction between the last value associated to the color and the first value from the new color. If the value in color column don't change, return -999. Ex: Looking to ana, the last value for red is 5 and the first value for yellow is 10. So, the new column will be 10 - 5 = 5 to ana. Looking to john, the last value for red is 4 and the first value for yellow is 11. So, the new column will be 11 - 4 = 7 to john. Do that just one time. If the color change again, it doesn't mather.

I want this output:

   name   color  number  difference
0  john     red       4           7
1   ana     red       4           5
2   ana     red       5           5
3  paul     red       6        -999
4  mark     red       3        -999
5   ana  yellow      10           5
6  john  yellow      11           7
7  john  yellow      12           7
8  john     red      13           7

please, somebody help me?

Upvotes: 1

Views: 114

Answers (1)

Marco Cerliani
Marco Cerliani

Reputation: 22031

try in this way

df = pd.DataFrame({'name':['john','ana','ana','paul','mark','ana','john','john','john'],
                   'color':['red','red','red','red','red','yellow','yellow','yellow','red'],
                   'number':[4,4,5,6,3,10,11,12,13]})

df['color_code'] = df['color'].factorize()[0]

partial_df = pd.DataFrame()
partial_df['difference'] = df.groupby('name')['number'].apply(lambda x: list(np.diff(x))).explode()
partial_df['change_status'] = df.groupby('name')['color_code'].apply(lambda x: list((np.diff(x)>0)+0)).explode()

map_difference = partial_df.loc[partial_df.change_status != 0].reset_index().drop_duplicates('name').set_index('name')['difference']
df['difference'] = df.name.copy().map(map_difference).fillna(-999)
df

enter image description here

Upvotes: 2

Related Questions