Reputation: 1042
I have two different DataFrame named df1 and df2, with same id column, but some ids have the same count and some ids have the different counts, so I want to get the data for same ids with the different count values, and both DataFrames have the different indexes
following is my df1
id valueA
0 255 1141
1 91 1130
2 347 830
3 30 757
4 68 736
5 159 715
6 32 713
7 110 683
8 225 638
9 257 616
my df2 is
id valueB
0 255 1231
1 91 1170
2 5247 954
3 347 870
4 30 757
5 68 736
6 159 734
7 32 713
8 110 683
9 225 644
10 257 616
11 917 585
12 211 575
13 25 530
how can I do that?
Upvotes: 2
Views: 2985
Reputation: 153460
Use merge
and filter dataframe with query
:
df1.merge(df2, on='id').query('valueA != valueB')
Output:
id valueA valueB
0 255 1141 1231
1 91 1130 1170
2 347 830 870
5 159 715 734
8 225 638 644
Same method different syntax:
df_out = df1.merge(df2, on='id')
df_out[df_out['valueA'] != df_out['valueB']]
In response to commment below:
Sure you can use set_index and let pandas handle alignment.
df1 = df1.set_index('id')
df2 = df2.set_index('id')
df_diff = df1['valueA'] - df2['valueB']
df_diff = df_diff[df_diff.notnull() & df_diff.ne(0)]
pd.concat([df1.reindex(df_diff.index), df2.reindex(df_diff.index)], axis=1)
Output:
valueA valueB
id
91 1130 1170
159 715 734
225 638 644
255 1141 1231
347 830 870
Upvotes: 2