SemiQuant

Reputation: 158

Compare list of values to range and index in python pandas

I'm trying to determine whether the IDs in one dataframe match those in a second dataframe, and whether the values fall within the ranges contained in the second dataframe. I haven't been able to find an answer to this; however, my Python isn't strong, so I apologize if I missed something that's already out there. Here is an example of what the dataframes look like.

import numpy as np
import pandas as pd

df1 = pd.DataFrame({ 'ID' : pd.Series(["A","A","C","C"]),
                    'Pos' : pd.Series([10, 60, 63, 105], dtype='int32')})

df2 = pd.DataFrame({ 'ID' : pd.Series(["A","B","C","C","D"]),
                    'Start' : pd.Series([10, 40, 61, 100, 250], dtype='int32'),
                    'End' : pd.Series([12, 59, 62, 200, 300], dtype='int32')})

So for every row in df1, I would like to check whether the ID is contained in df2 and, if so, whether the "Pos" in df1 falls within the range of "Start" to "End" in df2, i.e.:

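for row_id, value in zip(df1["ID"], df1["Pos"]):
    tmp_start = df2["Start"] <= value   # interval starts at or before Pos
    tmp_end = df2["End"] >= value       # interval ends at or after Pos
    tmp_ID = df2["ID"] == row_id        # interval belongs to the same ID
    # True if at least one row of df2 satisfies all three conditions
    if (tmp_start & tmp_end & tmp_ID).any():
        print("Do Something", value)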

So the above works, but it's not very fast and I'm sure there's a better way.

I also tried something like this, but it doesn't check the ID.

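def range_test(row):
    # every integer position covered by one Start..End interval (inclusive)
    return range(row['Start'], row['End'] + 1)

covered = df2.apply(range_test, axis=1)
# flatten all ranges into one sorted list of unique positions
covered = sorted({st for rng in covered for st in rng})
df1['Pos'].isin(covered)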

Upvotes: 0

Views: 1439

Answers (1)

shivsn

Reputation: 7838

IIUC:

In [34]: df1[(df1.ID.isin(df2.ID))&(df1.index.isin(df2.index))]
Out[34]: 
  ID  Pos
0  A   10
1  A   60
2  C   63
3  C  105
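
Note that the filter above only tests ID and index membership, so it does not check whether Pos lies between Start and End. One possible way to add that check is to merge the two frames on ID and then filter with `Series.between`. This is only a minimal sketch, assuming inclusive bounds as in the question's loop. The names `merged`, `hits` and `in_range` are illustrative and do not come from the question.

```python
import pandas as pd

df1 = pd.DataFrame({'ID': ["A", "A", "C", "C"],
                    'Pos': [10, 60, 63, 105]})
df2 = pd.DataFrame({'ID': ["A", "B", "C", "C", "D"],
                    'Start': [10, 40, 61, 100, 250],
                    'End': [12, 59, 62, 200, 300]})

# Keep df1's row labels as an 'index' column so they survive the merge.
# The inner merge pairs every df1 row with every df2 interval of the same ID.
merged = df1.reset_index().merge(df2, on='ID', how='inner')

# Inclusive range check: Start <= Pos <= End (assumed from the question's loop).
hits = merged[merged['Pos'].between(merged['Start'], merged['End'])]

# Boolean mask over df1: True where at least one interval with the same ID contains Pos.
in_range = df1.index.isin(hits['index'])
# in_range -> [True, False, False, True]

print(df1[in_range])
```

The merge produces one row per (df1 row, df2 interval) pair sharing an ID, so the intermediate frame can get large when an ID has many intervals. For the sample data that is not a concern.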

Upvotes: 1
