Reputation: 55
I have a dataframe that looks like this:
|userid|rank2017|rank2018|
|212 |'H' |'H' |
|322 |'L' |'H' |
|311 |'H' |'L' |
I want to create a new column called progress in the dataframe above that will be 1 if rank2017 is equal to rank2018, 2 if rank2017 is 'H' and rank2018 is 'L', and 3 otherwise. Can anybody help me do this in Python?
Upvotes: 3
Views: 1832
Reputation: 51395
Here is a way using np.select:
import numpy as np

# Set your conditions (np.select uses the first one that matches):
conds = [df['rank2017'] == df['rank2018'],
         (df['rank2017'] == 'H') & (df['rank2018'] == 'L')]
# Set the value for each condition
choices = [1, 2]
# Use np.select with a default of 3 (your "else" value)
df['progress'] = np.select(conds, choices, default=3)
Returns:
>>> df
   userid rank2017 rank2018  progress
0     212        H        H         1
1     322        L        H         3
2     311        H        L         2
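For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same np.select idea. The dataframe is rebuilt by hand from the three rows in the question, so the construction step is an assumption about how the data is loaded, not part of the original snippet.

```python
import numpy as np
import pandas as pd

# Sample data typed in from the question (assumed, not loaded from a file)
df = pd.DataFrame({'userid': [212, 322, 311],
                   'rank2017': ['H', 'L', 'H'],
                   'rank2018': ['H', 'H', 'L']})

# Conditions are checked in order; the first one that is True wins
conds = [df['rank2017'] == df['rank2018'],
         (df['rank2017'] == 'H') & (df['rank2018'] == 'L')]
choices = [1, 2]

# Rows matching neither condition get the default value 3
df['progress'] = np.select(conds, choices, default=3)
print(df)
```

The two conditions here cannot both be true for the same row, so their order does not matter in this case, but it would if the rules overlapped.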
Upvotes: 3
Reputation: 164773
Here is one way. You do not need to use nested if statements.
import pandas as pd

df = pd.DataFrame({'userid': [212, 322, 311],
                   'rank2017': ['H', 'L', 'H'],
                   'rank2018': ['H', 'H', 'L']})
# Start with the "else" value, then overwrite the rows that match each rule
df['progress'] = 3
df.loc[(df['rank2017'] == 'H') & (df['rank2018'] == 'L'), 'progress'] = 2
df.loc[df['rank2017'] == df['rank2018'], 'progress'] = 1
#    userid rank2017 rank2018  progress
# 0     212        H        H         1
# 1     322        L        H         3
# 2     311        H        L         2
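If readability matters more than speed, the rule from the question can also be written as a plain function and applied row by row. This is only a sketch: progress_rule is a hypothetical helper name, the dataframe is retyped from the question, and DataFrame.apply with axis=1 will usually be slower than the vectorized approaches on large frames.

```python
import pandas as pd

def progress_rule(row):
    # Hypothetical helper that encodes the rule as stated in the question
    if row['rank2017'] == row['rank2018']:
        return 1
    if row['rank2017'] == 'H' and row['rank2018'] == 'L':
        return 2
    return 3

# Sample data typed in from the question (assumed)
df = pd.DataFrame({'userid': [212, 322, 311],
                   'rank2017': ['H', 'L', 'H'],
                   'rank2018': ['H', 'H', 'L']})

# axis=1 passes each row to the function as a Series
df['progress'] = df.apply(progress_rule, axis=1)
print(df['progress'].tolist())  # [1, 3, 2]
```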
Upvotes: 3