Nasia Ntalla

Reputation: 1789

Compare columns pandas

I have four columns across two DataFrames, and I want to check whether id1 == id2 and count1 == count2 on each row, producing 1 if both match and 0 if they don't. However, my code returns only 0. I suspect it isn't comparing the values row by row and is instead matching up different row numbers. I tried zipping the columns I want, but it made no difference. Do you have any ideas? Thank you!

import pandas as pd
file1 = 'file1.csv'
file2 = 'file2.csv'

df1 = pd.read_csv(file1)
df2 = pd.read_csv(file2)

id1 = df1['id1']
count1 = df1['count1']
id2 = df2['id2']
count2 = df2['count2']

newresult = pd.concat([id1, count1, id2, count2], axis = 1)
id1 = zip(df1['id1'])
count1 = zip(df1['count1'])

newresult['compare'] = newresult.apply(lambda x: 1 if x['id1'] == x['id2'] and x['count1'] == x['count2'] else 0, axis = 1)

Upvotes: 1

Views: 89

Answers (1)

Euclides

Reputation: 287

import pandas as pd
import numpy as np

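# Sample data: 50 rows of random 0/1 values in each of the four columns
df = pd.DataFrame(np.random.randint(0, 2, (50, 4)), columns=["id1", "id2", "count1", "count2"])
# Vectorized row-wise comparison: 1 where both pairs match, otherwise 0
df["compare"] = ((df.id1==df.id2) & (df.count1==df.count2)).astype(int)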
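
The same comparison should also work when the columns come from two separate DataFrames, as in the question. Below is a minimal sketch under some assumptions: the small inline frames are hypothetical stand-ins for file1.csv and file2.csv, and the rows are meant to be compared by position. Resetting the index before concatenating is one way to make sure row 0 of one frame lines up with row 0 of the other, since pd.concat with axis=1 aligns on index labels rather than on row order.

```python
import pandas as pd

# Hypothetical stand-ins for the contents of file1.csv and file2.csv
df1 = pd.DataFrame({"id1": [1, 2, 3], "count1": [10, 20, 30]})
df2 = pd.DataFrame({"id2": [1, 2, 4], "count2": [10, 25, 30]})

# Place the four columns side by side; reset_index(drop=True) gives both
# frames a fresh 0..n-1 index so the rows line up by position
newresult = pd.concat(
    [
        df1[["id1", "count1"]].reset_index(drop=True),
        df2[["id2", "count2"]].reset_index(drop=True),
    ],
    axis=1,
)

# 1 where both pairs match on the same row, 0 otherwise
newresult["compare"] = (
    (newresult["id1"] == newresult["id2"])
    & (newresult["count1"] == newresult["count2"])
).astype(int)

print(newresult)
```

If the two files are not in the same row order, a positional comparison like this may not be what you want, and joining on a key column would be the thing to look at instead.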


Upvotes: 2
