MehmedB

Reputation: 1137

How to check if two columns are logically the same in pandas?

Say I have a dataframe as below:

column_1 column_2
1 car
2 truck
1 car
3 plane
3 plane
2 truck

Here column_1 clearly describes the same thing as column_2: every value in one column always pairs with the same value in the other. My dataset is huge, though, so I can't rely on visual inspection to spot this relationship. How can I check whether two columns (as in the example) are logically the same?

Upvotes: 2

Views: 137

Answers (1)

jezrael

Reputation: 862511

Use factorize to encode each column as integer codes, compare the two code arrays, and use all to test whether every comparison is True:

print((pd.factorize(df['column_1'])[0] == pd.factorize(df['column_2'])[0]).all())
True
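
For reference, here is a self-contained sketch of the factorize check run against the sample data from the question, plus a deliberately broken copy of it. The helper name same_grouping and the df_bad frame are made up for illustration and are not part of pandas.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "column_1": [1, 2, 1, 3, 3, 2],
    "column_2": ["car", "truck", "car", "plane", "plane", "truck"],
})

def same_grouping(a, b):
    """Hypothetical helper: True if a and b label the rows identically up to renaming."""
    # factorize numbers the distinct values in order of first appearance,
    # so two columns that differ only by a relabelling get identical codes.
    codes_a, _ = pd.factorize(a)
    codes_b, _ = pd.factorize(b)
    return bool((codes_a == codes_b).all())

result_same = same_grouping(df["column_1"], df["column_2"])

# Illustrative counterexample: the third row now pairs 1 with "truck".
df_bad = df.assign(column_2=["car", "truck", "truck", "plane", "plane", "truck"])
result_bad = same_grouping(df_bad["column_1"], df_bad["column_2"])

print(result_same, result_bad)
```

Because the codes follow order of first appearance, the comparison does not depend on the actual labels, only on which rows share a value.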

Another idea is to build a mapping from column_1 to column_2 and check that it reproduces column_2:

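# Build a column_1 -> column_2 lookup (for repeated keys, the last value wins)
d = df.set_index('column_1')['column_2'].to_dict()
# Map column_1 through the lookup and compare the result with column_2
print(df['column_1'].map(d).eq(df['column_2']).all())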
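
Note that the mapping check only tests one direction: it passes whenever each column_1 value always pairs with a single column_2 value, even if several column_1 values share the same column_2 value. If you need a strict one-to-one relationship, a possible sketch is to run the same check both ways. The helper names maps_onto and is_one_to_one and the many_to_one frame are hypothetical, chosen only for this example.

```python
import pandas as pd

def maps_onto(df, src, dst):
    """Hypothetical helper: True if each value in src always pairs with one value in dst."""
    lookup = df.set_index(src)[dst].to_dict()  # last value wins for repeated keys
    return bool(df[src].map(lookup).eq(df[dst]).all())

def is_one_to_one(df, a, b):
    """Hypothetical helper: True if the mapping holds in both directions."""
    return maps_onto(df, a, b) and maps_onto(df, b, a)

# Sample data from the question.
df = pd.DataFrame({
    "column_1": [1, 2, 1, 3, 3, 2],
    "column_2": ["car", "truck", "car", "plane", "plane", "truck"],
})

# Illustrative many-to-one case: three different keys share one label.
many_to_one = pd.DataFrame({
    "column_1": [1, 2, 3],
    "column_2": ["car", "car", "car"],
})

forward_only = maps_onto(many_to_one, "column_1", "column_2")
strict_sample = is_one_to_one(df, "column_1", "column_2")
strict_many = is_one_to_one(many_to_one, "column_1", "column_2")

print(forward_only, strict_sample, strict_many)
```

The factorize comparison above already fails on the many-to-one case, so the two-way version is only needed if you prefer the mapping approach.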

Upvotes: 3
