dsk
dsk

Reputation: 2011

Using Multiple filter condition in a for loop and to get Index Position of a DataFrame once matched

I have two DataFrame like below-
and I want to perform the following -

I want to take all the values from DF-A column TOTAL_NO_OF_RECORDS and want to check with DF-B column -Records as well as DF-A column TOTAL_QUANTITY and want to check with DF-B column -Quantity and if I get a match I need to print the index position of DF-A.

Having said that, please see the

row-0 in DF-A a 22142 309941.01

this is match with DF-B in column number two

22142 309941.01

So, the O/P would be: 0(Index of DF-A) DataFrame -A

#PARTNER_NAME   TOTAL_NO_OF_RECORDS TOTAL_QUANTITY
a   22142   309941.01
b   1719    34301.67
d   4376    345460.7
e   73723   2166729.41
c   2045    165651
f   307 1827993.54
b   1719    34301.67
c   2039    165674

DataFrame-B

Records Quantity
2045    165651
22142   309941.01
17  312
2   300
640 9375.75
68  256
1719    34301.67
2899    219700.8707
451 6658.06
70  3179.24
776 6849.52

in which I want to pull data from DataFrame-B and want to perform a validation between on DataFrame-A column TOTAL_NO_OF_RECORDS and TOTAL_QUANTITY with DatFrame-B (Records Quantity) on each line. If I get a match I want get the INDEX/ROW number where I will be getting the match. I am using the below code

for transaction_count,quantity_coun in zip(df_b.iloc[:,0], df_b.iloc[:,1]):
    indices = df_a.index[df_a['TOTAL_NO_OF_RECORDS'].all() == transaction_count and df_a['TOTAL_QUANTITY'] == quantity_count]

but not getting the proper index position. I want the desired output as- from DataFrame-A index position

print(indices)
O/P: 0
O/P: 1
O/P: 4
O/P: 7

If anyone can help it will be really helpful.

Upvotes: 0

Views: 172

Answers (1)

Jon Clements
Jon Clements

Reputation: 142256

It appears that what you want to do is find rows between A and B where two columns match - for that you can use a .merge. You need to drop the index from DataFrame A such that it's a column and accessible via column selection, eg:

indices = df1.reset_index().merge(
    df2, 
    right_on=['Records', 'Quantity'], 
    left_on=['TOTAL_NO_OF_RECORDS', 'TOTAL_QUANTITY']
)['index'].values

This gives you:

array([0, 1, 6, 4])

Upvotes: 1

Related Questions