Reputation: 13
Using Pandas, I have a dataframe that looks like this:
col_a col_b col_a1 col_b1
Larry Larry Peter Peter
Lee Lee Jeremy Ilia
I want to compare col_a to col_b, and col_a1 to col_b1. If both pairs match, indicate it in a new column (flag):
col_a col_b col_a1 col_b1 flag
Larry Larry Peter Peter True
Lee Lee Jeremy Ilia False
How can I do this?
Upvotes: 1
Views: 66
Reputation: 16134
I find the following code to be much simpler to read through.
You just have to compare the two columns of each pair, then combine both results with & (element-wise AND) to get the flag column.
In one line:
In [18]: tf['flag'] = (tf['col_a'] == tf['col_b']) & (tf['col_a1'] == tf['col_b1'])
In [19]: tf
Out[19]:
col_a col_b col_a1 col_b1 flag
0 Larry Larry Peter Peter True
1 Lee Lee Jeremy Ilia False
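For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch of the same comparison. The frame is rebuilt from the question's sample data under the name tf used above; the intermediate names pair_1 and pair_2 are only illustrative.

```python
import pandas as pd

# Sample data copied from the question; the name tf follows the session above.
tf = pd.DataFrame({
    "col_a": ["Larry", "Lee"],
    "col_b": ["Larry", "Lee"],
    "col_a1": ["Peter", "Jeremy"],
    "col_b1": ["Peter", "Ilia"],
})

# Each == yields a boolean Series, one value per row.
pair_1 = tf["col_a"] == tf["col_b"]
pair_2 = tf["col_a1"] == tf["col_b1"]

# & combines the two Series element-wise. When everything is written on
# one line, each comparison needs its own parentheses because & has
# higher precedence than ==.
tf["flag"] = pair_1 & pair_2

print(tf)
```

Using the plain Python keyword and between two Series raises a ValueError, which is why the element-wise & is used here.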
Upvotes: 0
Reputation: 17455
You can use DataFrame.eval:
import pandas as pd
df = pd.DataFrame({
"col_a":["Larry","Lee"],
"col_b":["Larry","Lee"],
"col_a1":["Peter","Jeremy"],
"col_b1":["Peter","Ilia"]
})
print(df)
df["flag"] = df.eval("col_a==col_b and col_a1==col_b1")
print(df)
Output (Python 3 with a recent pandas, where columns keep their insertion order):
col_a col_b col_a1 col_b1
0 Larry Larry Peter Peter
1 Lee Lee Jeremy Ilia
col_a col_b col_a1 col_b1 flag
0 Larry Larry Peter Peter True
1 Lee Lee Jeremy Ilia False
If the columns to be compared are stored in two lists such as a_cols and b_cols, you can do something like:
a_cols = ["col_a","col_a1"]
b_cols = ["col_b","col_b1"]
df["flag"] = df.eval(" and ".join("%s==%s" % pair for pair in zip(a_cols,b_cols)))
print(df)
Output (Python 3 with a recent pandas, where columns keep their insertion order):
col_a col_b col_a1 col_b1 flag
0 Larry Larry Peter Peter True
1 Lee Lee Jeremy Ilia False
Upvotes: 0
Reputation: 1134
You can use the apply function:
import pandas as pd
df = pd.DataFrame({'col_a':('A','B'), 'col_b':('A','B'), 'col_a1':('C','D'),'col_b1':('C','E')})
df = df[['col_a','col_b','col_a1','col_b1']]
# Return real booleans rather than the strings 'True'/'False'
df['flag'] = df.apply(lambda x: x['col_a'] == x['col_b'] and x['col_a1'] == x['col_b1'], axis=1)
print(df)
Upvotes: 1