Reputation: 1204
I want to create a function that will create a new column based on certain conditions in two other columns in one pass. Here's a sample dataframe:
col_a col_b col_c
abc abc jkl
def def mno
ghi pqr pqr
So the resulting dataframe I would like is:
col_a col_b col_c col_d
abc abc jkl abc
def def mno def
ghi pqr pqr ghipqr
Assuming there are more columns and rows in this dataframe, I would like to create a function such that if col_a == col_b, then col_d = col_a. If col_a != col_b, then col_d = col_a + col_c. These columns are all strings.
I tried something like this:
list_1 = df['col_a'] == df['col_b']
if list_1 is True:
    pass
elif list_1 is False:
    df['col_d'] = df['col_a'] + df['col_c']
But this didn't seem to work, and it added everything, regardless of whether the condition was True or False. Any help would be appreciated!
Upvotes: 3
Views: 70
Reputation: 19957
import numpy as np
df['col_d'] = np.where(df.col_a == df.col_b, df.col_a, df.col_a + df.col_c)
or
df['col_d'] = df.col_a + np.where(df.col_a.eq(df.col_b), '', df.col_c)
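For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question (the variable name mask is just illustrative). It also shows why the attempt in the question did not behave as expected: the comparison returns a boolean Series, so identity checks against True or False never match and the condition has to be applied element-wise.

```python
import numpy as np
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({
    "col_a": ["abc", "def", "ghi"],
    "col_b": ["abc", "def", "pqr"],
    "col_c": ["jkl", "mno", "pqr"],
})

# Element-wise comparison yields a boolean Series, not the single
# object True or False, so both identity checks below are False.
mask = df["col_a"] == df["col_b"]
print(mask is True, mask is False)

# np.where picks col_a where the mask holds, otherwise col_a + col_c.
df["col_d"] = np.where(mask, df["col_a"], df["col_a"] + df["col_c"])
print(df)
```

Because np.where evaluates the whole column at once, no explicit loop or if/elif branch is needed.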
Upvotes: 3
Reputation: 29752
IIUC, you can use pandas.Series.where instead:
s = df["col_a"]
df["col_d"] = s.where(s.eq(df["col_b"]), s + df["col_c"])
print(df)
Output:
col_a col_b col_c col_d
0 abc abc jkl abc
1 def def mno def
2 ghi pqr pqr ghipqr
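Since the question asks for a function, the same idea could be wrapped up roughly like this. It is only a sketch: add_col_d is a hypothetical helper name, and it assumes the three columns are named exactly as in the question and all hold strings.

```python
import pandas as pd


def add_col_d(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of frame with col_d added (hypothetical helper)."""
    out = frame.copy()
    s = out["col_a"]
    # Series.where keeps the values of s where the condition is True
    # and takes the replacement (s + col_c) everywhere else.
    out["col_d"] = s.where(s.eq(out["col_b"]), s + out["col_c"])
    return out


# Sample frame from the question.
df = pd.DataFrame({
    "col_a": ["abc", "def", "ghi"],
    "col_b": ["abc", "def", "pqr"],
    "col_c": ["jkl", "mno", "pqr"],
})

result = add_col_d(df)
print(result["col_d"].tolist())  # ['abc', 'def', 'ghipqr']
```

Working on a copy leaves the caller's dataframe untouched; drop the copy() call if in-place modification is preferred.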
Upvotes: 2