Reputation: 21552
Starting from a simple dataframe df like:
C,n
AAA,1
AAA,2
BBB,1
BBB,2
CCC,1
CCC,2
DDD,1
DDD,2
I would like to add a column based on some conditions on values in the C
column. The column I would like to add is:
df['H'] = df['n'] / 10
which returns:
C n H
0 AAA 1 0.1
1 AAA 2 0.2
2 BBB 1 0.1
3 BBB 2 0.2
4 CCC 1 0.1
5 CCC 2 0.2
6 DDD 1 0.1
7 DDD 2 0.2
Now I would like to add the same column, but with a different normalization factor only for the values CCC and DDD in column C, for instance:
df['H'] = df['n'] / 100
so that:
C n H
0 AAA 1 0.1
1 AAA 2 0.2
2 BBB 1 0.1
3 BBB 2 0.2
4 CCC 1 0.01
5 CCC 2 0.02
6 DDD 1 0.01
7 DDD 2 0.02
So far I tried to mask the dataframe as:
mask = df['C'] == 'CCC'
df = df[mask]
df['H'] = df['n'] / 100
and that worked on the masked sample. But since I have to apply several filters while keeping the original H
column for the non-filtered values, I'm getting confused.
Upvotes: 2
Views: 1830
Reputation: 6276
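Using the examples in this answer, you can assign through the mask with .loc. Chained indexing such as df['H'][mask] = ... may write to a temporary copy and triggers SettingWithCopyWarning, and dividing the already-normalized H by 100 would give n / 1000 rather than n / 100:
df.loc[mask, 'H'] = df.loc[mask, 'n'] / 100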
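You could also calculate the H column separately for each case ('CCC'/'DDD' or not 'CCC'/'DDD'):
import numpy as np
# True for rows whose C value is 'CCC' or 'DDD'
mask = np.logical_or(df['C'] == 'CCC', df['C'] == 'DDD')
not_mask = np.logical_not(mask)
# Divide n (not an already-normalized H) so each row is scaled exactly once
df.loc[not_mask, 'H'] = df.loc[not_mask, 'n'] / 10
df.loc[mask, 'H'] = df.loc[mask, 'n'] / 100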
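The two masked assignments can also be collapsed into a single step. The sketch below swaps in np.where, which is not part of the snippet above, and rebuilds the example frame from the question so it runs on its own:

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "C": ["AAA", "AAA", "BBB", "BBB", "CCC", "CCC", "DDD", "DDD"],
    "n": [1, 2, 1, 2, 1, 2, 1, 2],
})

# True for rows whose C value is 'CCC' or 'DDD'.
mask = df["C"].isin(["CCC", "DDD"])

# Pick n / 100 where the mask is True and n / 10 everywhere else.
df["H"] = np.where(mask, df["n"] / 100, df["n"] / 10)

print(df)
```

Because H is computed from n in one pass, no row is normalized twice and no rows are dropped from df.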
Upvotes: 1
Reputation: 1181
You can also use loc (the older ix indexer has been removed from pandas):
df.loc[df['C'].isin(['CCC','DDD']), 'H'] = df['n'] / 100
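Since the question mentions having to apply several filters, one way to avoid a chain of masks is to look up a divisor per value of C. This is only a sketch: the divisors dictionary and default_divisor are hypothetical names introduced here for illustration, with values taken from the question's example.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "C": ["AAA", "AAA", "BBB", "BBB", "CCC", "CCC", "DDD", "DDD"],
    "n": [1, 2, 1, 2, 1, 2, 1, 2],
})

# Hypothetical lookup table: divisor to use for specific values of C.
divisors = {"CCC": 100, "DDD": 100}
# Assumed fallback for every value of C not listed above.
default_divisor = 10

# map() yields NaN for unlisted values; fillna() substitutes the default.
factor = df["C"].map(divisors).fillna(default_divisor)

df["H"] = df["n"] / factor

print(df)
```

Adding another filter then only means adding an entry to the dictionary, and the rows that match no entry keep the default normalization.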
Upvotes: 2