Don Coder

Reputation: 556

How can I compare a DataFrame with another DataFrame's columns?

I have two DataFrames: df is the complete one, and sample is the one to compare against. Here is the data I have:

sample.tail()
   T1  C   C1   C2   C3
0   1  5  0.0  7.0  5.0

df.tail()
   T1  T2  C   C1   C2   C3
0   1   0  5  4.0  6.0  6.0
1   0   0  5  5.0  4.0  6.0
2   0   1  7  5.0  5.0  4.0
3   1   1  0  7.0  5.0  5.0
4   1   1  5  0.0  7.0  5.0

I have selected some columns in sample and am trying to find the rows in df whose values match the sample on those columns.

Here is what I have tried so far, without luck:

cols = sample.columns
df = df[df[cols] == sample[cols]]

and I am getting the following error:

ValueError: Can only compare identically-labeled DataFrame objects

Can you kindly help me find a solution for this?

EDIT: Expected output

df.tail()
   T1  T2  C   C1   C2   C3
0   1   0  5  0.0  7.0  5.0
21  1   1  5  0.0  7.0  5.0
27  1   0  5  0.0  7.0  5.0
34  1   1  5  0.0  7.0  5.0
42  1   1  5  0.0  7.0  5.0
47  1   0  5  0.0  7.0  5.0
51  1   1  5  0.0  7.0  5.0

As you can see, every column matches the sample DataFrame except T2, which is not in the sample. This is my expected output.

Thanks

Upvotes: 2

Views: 49

Answers (1)

Ami Tavory

Reputation: 76297

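The == comparison fails because df and sample do not have the same row and column labels. Instead, restrict both frames to the columns they share with pd.Index.intersection, then keep the rows of df whose values on those columns appear in sample: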

cols = sample.columns.intersection(df.columns)
df[df[cols].apply(tuple, axis=1).isin(sample[cols].apply(tuple, axis=1))]
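For reference, here is a minimal runnable sketch of the same idea, rebuilt from the five df rows and the single sample row shown in the question. The names row_keys, sample_keys, result and merged are only illustrative. The merge at the end is an alternative I am adding for comparison, not part of the approach above; it assumes the shared columns have matching dtypes in both frames.

```python
import pandas as pd

# Data copied from the question.
sample = pd.DataFrame(
    {"T1": [1], "C": [5], "C1": [0.0], "C2": [7.0], "C3": [5.0]}
)
df = pd.DataFrame(
    {
        "T1": [1, 0, 0, 1, 1],
        "T2": [0, 0, 1, 1, 1],
        "C": [5, 5, 7, 0, 5],
        "C1": [4.0, 5.0, 5.0, 7.0, 0.0],
        "C2": [6.0, 4.0, 5.0, 5.0, 7.0],
        "C3": [6.0, 6.0, 4.0, 5.0, 5.0],
    }
)

# Columns present in both frames (T2 is dropped because sample lacks it).
cols = sample.columns.intersection(df.columns)

# Turn each row into a tuple so that whole rows can be compared at once.
row_keys = df[cols].apply(tuple, axis=1)
sample_keys = sample[cols].apply(tuple, axis=1)

# Keep the rows of df whose tuple also occurs in sample.
result = df[row_keys.isin(sample_keys)]
print(result)

# Alternative (my assumption, not from the answer): an inner merge on the
# shared columns. reset_index/set_index preserves df's original row labels,
# and drop_duplicates avoids repeating rows if sample has duplicate rows.
merged = (
    df.reset_index()
    .merge(sample[cols].drop_duplicates(), on=list(cols), how="inner")
    .set_index("index")
)
print(merged)
```

With this data, only the last row of df matches the sample on the shared columns; its T2 value is kept because the filter is applied to the full df.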

Upvotes: 1
