Jesse Jin
Jesse Jin

Reputation: 21

Creating New Column with value of other columns based on condition on a third Column in Pandas DataFrame

My dataframe looks like this

category value1 value2 
A.       20.      30.
B.       40.      50.
A.       60.      70. 
B.       80.      90.
C.       10.      10. 
D.       20.      20.  

I want to create a new column that has the value of either value1 or value2 based on a condition relative to category. For example, if category is A then store value1, if category is B then store value2,if else then store nan. I expect an output like this:

category value1 value2 new_col
A.       20.      30.   20.
B.       40.      50.   50.
A.       60.      70.   60.  
B.       80.      90.   90.
C.       10.      10.   nan
D.       20.      20.   nan
 

How can I do that?

Upvotes: 1

Views: 4665

Answers (3)

PaulS
PaulS

Reputation: 25313

Another possible solution, based on numpy.where:

df['new_col'] = np.where(df.category.eq('A.'), df.value1, np.where(
    df.category.eq('B.'), df.value2, np.nan))

Output:

  category  value1  value2  new_col
0       A.    20.0    30.0     20.0
1       B.    40.0    50.0     50.0
2       A.    60.0    70.0     60.0
3       B.    80.0    90.0     90.0
4       C.    10.0    10.0      NaN
5       D.    20.0    20.0      NaN

Yet another possible solution, based on pandas.DataFrame.update:

df['new_col'] = df.value1.loc[df.category.eq('A.')]
df['new_col'].update(df.value2.loc[df.category.eq('B.')])

Output:

# same

Upvotes: 1

sammywemmy
sammywemmy

Reputation: 28659

One option is with case_when from pyjanitor:

# pip install pyjanitor
import pandas as pd
import janitor

(df
.case_when(
    df.category.eq('A.'), df.value1, # condition, result
    df.category.eq('B.'), df.value2,  
    np.nan, # default
    column_name = 'new_col')
)

  category  value1  value2  new_col
0       A.    20.0    30.0     20.0
1       B.    40.0    50.0     50.0
2       A.    60.0    70.0     60.0
3       B.    80.0    90.0     90.0
4       C.    10.0    10.0      NaN
5       D.    20.0    20.0      NaN

Upvotes: 0

Naveed
Naveed

Reputation: 11650

# np.select and define list of condition with corresponding values
df['value3']=(np.select([df['category'].eq('A.'), # condition #1
                         df['category'].eq('B.')],# condition #2
                        [df['value1'],            # value when #1 is true
                         df['value2']],           # value when #2 is true
                        np.nan))                  # default value
df
    category    value1  value2  value3
0          A.   20.0    30.0    20.0
1          B.   40.0    50.0    50.0
2          A.   60.0    70.0    60.0
3          B.   80.0    90.0    90.0
4          C.   10.0    10.0    NaN
5          D.   20.0    20.0    NaN

Upvotes: 2

Related Questions