Reputation: 1137
Say I have a dataframe as below:
column_1 | column_2 |
---|---|
1 | car |
2 | truck |
1 | car |
3 | plane |
3 | plane |
2 | truck |
You can clearly see that column_1 is logically describing the same thing as column_2. But my dataset is huge and I can't use a visual inspection to understand this relationship between these 2 columns. How can I check if two columns (as shown in the example) are actually logically the same?
Upvotes: 2
Views: 137
Reputation: 862511
Use pd.factorize on both columns, compare the two output arrays, and call all to test whether every value is True:
print((pd.factorize(df['column_1'])[0] == pd.factorize(df['column_2'])[0]).all())
True
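For reference, here is a self-contained sketch of the factorize comparison on the sample data from the question. The helper name same_grouping is made up for illustration and is not part of pandas. It relies on factorize numbering distinct values in order of first appearance, so two columns that correspond one-to-one should end up with identical code arrays.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "column_1": [1, 2, 1, 3, 3, 2],
    "column_2": ["car", "truck", "car", "plane", "plane", "truck"],
})


def same_grouping(a, b):
    """Hypothetical helper: True if a and b label the rows identically."""
    # factorize assigns integer codes in order of first appearance,
    # so columns in one-to-one correspondence get equal code arrays.
    codes_a, _ = pd.factorize(a)
    codes_b, _ = pd.factorize(b)
    return bool((codes_a == codes_b).all())


print(same_grouping(df["column_1"], df["column_2"]))  # True
```

Wrapping the result in bool() just turns the NumPy boolean into a plain Python one.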
Another idea with mapping:
d = df.set_index('column_1')['column_2'].to_dict()
print (df['column_1'].map(d).eq(df['column_2']).all())
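Note that the mapping check only confirms that each value of column_1 always goes with the same value of column_2. It would still pass if two different column_1 values shared one column_2 value. If the reverse direction matters too, one possible sketch is to count distinct partners per group in both directions. The function name is_one_to_one is an assumption for illustration, and the sketch assumes there are no missing values, since groupby drops NaN keys by default.

```python
import pandas as pd


def is_one_to_one(df, left, right):
    """Hypothetical helper: True if left and right values pair up one-to-one."""
    # Each left value must pair with exactly one distinct right value...
    left_ok = (df.groupby(left)[right].nunique() == 1).all()
    # ...and each right value with exactly one distinct left value.
    right_ok = (df.groupby(right)[left].nunique() == 1).all()
    return bool(left_ok and right_ok)


df = pd.DataFrame({
    "column_1": [1, 2, 1, 3, 3, 2],
    "column_2": ["car", "truck", "car", "plane", "plane", "truck"],
})
print(is_one_to_one(df, "column_1", "column_2"))  # True

# Two different codes sharing one label pass the mapping check,
# but they are not a one-to-one correspondence.
bad = pd.DataFrame({"column_1": [1, 2], "column_2": ["car", "car"]})
print(is_one_to_one(bad, "column_1", "column_2"))  # False
```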
Upvotes: 3