Reputation: 35
I posted a similar question before, but I thought it would be better to restate it. I have a dataframe in which each compound is assigned a number, as follows:
compound,number
17alpha_beta_SID_24898755,8
2_prolinal_109328,3
4_chloro_4491,37
5HT_144234_01,87
5HT_144234_02,2
6-OHDA_153466,23
There is also a second dataframe with additional properties. It contains the same compound names, but each compound appears in several rows with different numbers, not only with the number it has in the first dataframe. The rows where the number differs are not of interest:
rmsd,chemplp,plp,compound,number
1.00,14.00,-25.00,17alpha_beta_SID_24898755,7
0.38,12.00,-19.00,17alpha_beta_SID_24898755,8
0.66,16.00,-25.6,17alpha_beta_SID_24898755,9
0.87,24.58,-38.35,2_prolinal_109328,3
0.17,54.58,-39.32,2_prolinal_109328,4
0.22,22.58,-32.35,2_prolinal_109328,5
0.41,45.32,-37.90,4_chloro_4491,37
0.11,15.32,-37.10,4_chloro_4491,38
0.11,15.32,-17.90,4_chloro_4491,39
0.61,38.10,-45.86,5HT_144234_01,85
0.62,18.10,-15.86,5HT_144234_01,86
0.64,28.10,-25.86,5HT_144234_01,87
0.64,16.81,-10.87,5HT_144234_02,2
0.14,16.11,-10.17,5HT_144234_02,3
0.14,16.21,-10.17,5HT_144234_02,4
0.15,31.85,-24.23,6-OHDA_153466,23
0.13,21.85,-34.23,6-OHDA_153466,24
0.11,11.85,-54.23,6-OHDA_153466,25
The problem is that I want to find each compound and its corresponding number from dataframe 1 in dataframe 2, and return its entire row.
I was only able to come up with this, but because of the way the iteration pairs up the rows, it doesn't do what I intend:
import numpy as np
import csv
import pandas as pd

for c1, n1, c2, n2 in zip(df1.compound, df1.number, df2.compound, df2.number):
    if c1 == c2 and n1 == n2:
        print(df2[*])
Example: for 17alpha_beta_SID_24898755 (compound) and 8 (its number) in dataframe 1, return the row of dataframe 2 in which this compound and this number are found. The result should be:
0.38,12.00,-19.00,17alpha_beta_SID_24898755,8
I'd like to do this for all the compounds and their corresponding numbers from dataframe 1. The example I gave is only a small subset of a very long list. If anyone could help, thank you!
Upvotes: 1
Views: 55
Reputation: 1878
Take a look at the df.merge method:
df1.merge(df2, on=['compound', 'number'], how='inner')
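As a minimal sketch of how the merge behaves on the data from the question, the snippet below rebuilds shortened versions of both dataframes from inline CSV text (the use of io.StringIO and the variable name matched are assumptions made for illustration, not part of the original post). An inner merge on both key columns keeps only the rows where the compound and the number agree in both frames.

```python
import io

import pandas as pd

# Shortened copies of the two tables from the question, loaded from inline CSV.
df1 = pd.read_csv(io.StringIO("""compound,number
17alpha_beta_SID_24898755,8
2_prolinal_109328,3
4_chloro_4491,37
"""))

df2 = pd.read_csv(io.StringIO("""rmsd,chemplp,plp,compound,number
1.00,14.00,-25.00,17alpha_beta_SID_24898755,7
0.38,12.00,-19.00,17alpha_beta_SID_24898755,8
0.66,16.00,-25.6,17alpha_beta_SID_24898755,9
0.87,24.58,-38.35,2_prolinal_109328,3
0.17,54.58,-39.32,2_prolinal_109328,4
0.41,45.32,-37.90,4_chloro_4491,37
0.11,15.32,-37.10,4_chloro_4491,38
"""))

# Inner merge on BOTH columns: a df2 row survives only if its
# (compound, number) pair also appears in df1.
matched = df2.merge(df1, on=["compound", "number"], how="inner")

print(matched.to_csv(index=False))
```

Calling df2.merge(df1, ...) instead of df1.merge(df2, ...) selects the same rows; it only puts the columns in df2's order (rmsd, chemplp, plp, compound, number), which matches the desired output in the question. Note that if a (compound, number) pair were repeated in df1, the matching df2 row would be repeated in the result as well.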
Upvotes: 1