Reputation: 247
Assume I have two DataFrames, each with hundreds of columns and rows. I would like to compare them cell by cell, matching on the same row and column labels. For example:
import pandas as pd
df1 = pd.DataFrame({
'Place' : ['A', 'B', 'C','D'],
'Peter' : [4,5,1.2,7],
'John' : [1,0,3,5],
})
df1_1 = df1.set_index('Place')
df2 = pd.DataFrame({
'Place' : ['A', 'B', 'C','D'],
'Peter' : ['NA',5,1.2,8.5],
'John' : [1,0,3,5],
})
df2_2 = df2.set_index('Place')
For the Peter column in df1_1 and df2_2, rows B and C are the same but the others are not, so the share of matching values in the Peter column is 2/4 = 0.5. Likewise, the John column gives 4/4 = 1.00.
Is there an elegant way to do this using pandas?
Upvotes: 0
Views: 34
Reputation: 61947
You should be able to do (df1_1 == df2_2).mean()
which compares the two frames element-wise, label by label, and makes each value a boolean. Taking the mean of each column returns the fraction of values that match.
Your dataframes need to be identically labeled.
Output
John 1.0
Peter 0.5
dtype: float64
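For reference, here is a minimal self-contained sketch of the above using the frames from the question. The helper name match_fraction_aligned is hypothetical and not part of pandas. It assumes that when the two frames are not identically labeled, comparing only the shared row and column labels is what you want.

```python
import pandas as pd

# Frames from the question, indexed by Place.
df1_1 = pd.DataFrame({
    'Place': ['A', 'B', 'C', 'D'],
    'Peter': [4, 5, 1.2, 7],
    'John': [1, 0, 3, 5],
}).set_index('Place')

df2_2 = pd.DataFrame({
    'Place': ['A', 'B', 'C', 'D'],
    'Peter': ['NA', 5, 1.2, 8.5],
    'John': [1, 0, 3, 5],
}).set_index('Place')

# Element-wise comparison yields a boolean frame;
# the mean of each boolean column is the fraction of matching cells.
match_fraction = (df1_1 == df2_2).mean()


def match_fraction_aligned(a, b):
    """Hypothetical helper: restrict both frames to their shared
    row/column labels before comparing (an assumption about intent)."""
    a_aligned, b_aligned = a.align(b, join='inner')
    return (a_aligned == b_aligned).mean()


# Same data with the rows in reverse order: a plain == would raise
# ValueError here because the frames are not identically labeled.
aligned_fraction = match_fraction_aligned(df1_1, df2_2.iloc[::-1])

print(match_fraction)
print(aligned_fraction)
```

Note that real missing values (NaN) never compare equal to each other, so a cell that is NaN in both frames counts as a mismatch with this approach.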
Upvotes: 4