Reputation: 13
df1-
ID Name Number
0 AAA 123
1 BBB 456
2 CCC 789
df2-
ID Name Number
0 AAA 123
1 BBB 456
2 CCC **963** <----- Non Matching value
want to compare above two data frames df1 and df2 want the result in below format: need only non matching value with column name.
expected output:
ID Number
2 963
can anyone help me with the code i am new to pandas, please help me out Thanks you soo much...
Upvotes: 1
Views: 7497
Reputation: 5191
You can Extract the data whatever you want from the output, which has the details of all mismatches
Upvotes: 0
Reputation: 23217
You can use .merge()
with indicator turned on and filter the result by the indicator, as follows:
df3 = df2.merge(df1, how='left', indicator=True)
df3[df3['_merge'] == 'left_only'][['ID', 'Number']]
Result:
ID Number
2 2 963
If you have multiple columns and would not like to specify the column names to highlight the differences, you can try:
df2[(df1 != df2)].dropna(how='all', axis=1).dropna(how='all', axis=0)
df1
ID Name Number1 Number2 Number3
0 0 AAA 123 12 1111
1 1 BBB 456 22 2222
2 2 CCC 789 32 3333
df2
ID Name Number1 Number2 Number3
0 0 AAA 123 12 1111
1 1 BBB 456 22 2255
2 2 CCC 963 32 3333
df2[df1 != df2].dropna(how='all', axis=1).dropna(how='all', axis=0)
Number1 Number3
1 NaN 2255.0
2 963.0 NaN
You can see from the non_NaN values the differences. The ID
is the index on the left.
If your numbers are all integers and you don't want Pandas to show the integers as float type together with NaN
values, you can use:
df2[df1 != df2].dropna(how='all', axis=1).dropna(how='all', axis=0).fillna('').astype(str).replace(r'\.0', '', regex=True)
Number1 Number3
1 2255
2 963
Or, simply use:
df2[df1 != df2].dropna(how='all', axis=1).dropna(how='all', axis=0).astype('Int64')
Number1 Number3
1 <NA> 2255
2 963 <NA>
Upvotes: 3
Reputation: 627
You can use the following
df2[df1.Number != df2.Number][['ID', 'Number']]
Upvotes: 0