Reputation: 4190
I'm trying the following code:
In [29]: indexes_to_search = [1, 3, 4]
In [30]: df = pd.DataFrame([(1, 2, 3), (4, 5, 6), (7, 8, 9)], columns=["id", "val1", "val2"]).set_index("id")
In [31]: df
Out[31]:
    val1  val2
id
1      2     3
4      5     6
7      8     9
In [32]: df.loc[indexes_to_search]
Out[32]:
    val1  val2
id
1    2.0   3.0
3    NaN   NaN
4    5.0   6.0
For some reason, the result includes index 3 with NaN values in the columns. In my real problem, indexes_to_search can contain labels that are not in the index (3 in my example). I want to avoid adding an extra line to drop the NaN rows, since my DataFrame is very large.
So the question is: how can I search by a list of index labels like .loc does, but without the NaN rows?
I would expect:
    val1  val2
id
1    2.0   3.0
4    5.0   6.0
Upvotes: 1
Views: 1652
Reputation: 862591
Need Index.intersection:
df1 = df.loc[df.index.intersection(indexes_to_search)]
print (df1)
    val1  val2
id
1      2     3
4      5     6
Or use set intersection (note that sets do not preserve order, so the row order may differ):
df1 = df.loc[set(df.index).intersection(indexes_to_search)]
print (df1)
    val1  val2
id
1      2     3
4      5     6
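Another option not shown above is a boolean mask with Index.isin, which keeps only rows whose label appears in the list and preserves the original row order and integer dtypes (a sketch using the question's sample data):

```python
import pandas as pd

df = pd.DataFrame([(1, 2, 3), (4, 5, 6), (7, 8, 9)],
                  columns=["id", "val1", "val2"]).set_index("id")
indexes_to_search = [1, 3, 4]

# Rows whose index label is in the list; the missing label 3
# is simply ignored, so no NaN rows are produced.
df1 = df[df.index.isin(indexes_to_search)]
print(df1)
```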
In my pandas version (0.22.0) I get a warning:
df1 = df.loc[indexes_to_search]
print (df1)
    val1  val2
id
1    2.0   3.0
3    NaN   NaN
4    5.0   6.0
FutureWarning: Passing list-likes to .loc or [] with any missing label will raise KeyError in the future, you can use .reindex() as an alternative
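As that warning suggests, .reindex() accepts missing labels without raising; chaining dropna then removes the NaN rows. A sketch (note that reindex inserts NaN for the missing label, which upcasts the integer columns to float):

```python
import pandas as pd

df = pd.DataFrame([(1, 2, 3), (4, 5, 6), (7, 8, 9)],
                  columns=["id", "val1", "val2"]).set_index("id")
indexes_to_search = [1, 3, 4]

# reindex returns a row for every requested label (NaN where missing),
# then dropna(how="all") discards the all-NaN rows.
df1 = df.reindex(indexes_to_search).dropna(how="all")
print(df1)
```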
Upvotes: 5