Reputation: 4190
I'm trying the following code:
In [29]: indexes_to_search = [1, 3, 4]
In [30]: df = pd.DataFrame([(1, 2, 3), (4, 5, 6), (7, 8, 9)], columns=["id", "val1", "val2"]).set_index("id")
In [31]: df
Out[31]:
    val1  val2
id
1      2     3
4      5     6
7      8     9
In [32]: df.loc[indexes_to_search]
Out[32]:
    val1  val2
id
1    2.0   3.0
3    NaN   NaN
4    5.0   6.0
For some reason, the result includes index 3 with NaN values in the columns. In my real problem, indexes_to_search can contain labels that are not in the index (3 in my example). I want to avoid adding an extra line to drop the NaN rows, since my DataFrame is very large.
So the question is: how can I search by a list of index labels like .loc does, but without the NaN rows?
I would expect:
    val1  val2
id
1    2.0   3.0
4    5.0   6.0
Upvotes: 1
Views: 1652
Reputation: 862591
Need Index.intersection:
df1 = df.loc[df.index.intersection(indexes_to_search)]
print (df1)
    val1  val2
id
1      2     3
4      5     6
Or use set intersection (note that sets do not preserve order, so the row order may differ):
df1 = df.loc[set(df.index).intersection(indexes_to_search)]
print (df1)
    val1  val2
id
1      2     3
4      5     6
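Another option not shown above is a boolean mask with Index.isin, which keeps only rows whose label appears in the list and preserves the original row order and integer dtypes (a sketch using the question's sample data):

```python
import pandas as pd

df = pd.DataFrame([(1, 2, 3), (4, 5, 6), (7, 8, 9)],
                  columns=["id", "val1", "val2"]).set_index("id")
indexes_to_search = [1, 3, 4]

# Rows whose index label is in the list; the missing label 3
# is simply ignored, so no NaN rows are produced.
df1 = df[df.index.isin(indexes_to_search)]
print(df1)
```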
In my pandas version (0.22.0) I get a warning:
df1 = df.loc[indexes_to_search]
print (df1)
    val1  val2
id
1    2.0   3.0
3    NaN   NaN
4    5.0   6.0
FutureWarning: Passing list-likes to .loc or [] with any missing label will raise KeyError in the future, you can use .reindex() as an alternative
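As that warning suggests, .reindex() accepts missing labels without raising; chaining dropna then removes the NaN rows. A sketch (note that reindex inserts NaN for the missing label, which upcasts the integer columns to float):

```python
import pandas as pd

df = pd.DataFrame([(1, 2, 3), (4, 5, 6), (7, 8, 9)],
                  columns=["id", "val1", "val2"]).set_index("id")
indexes_to_search = [1, 3, 4]

# reindex returns a row for every requested label (NaN where missing),
# then dropna(how="all") discards the all-NaN rows.
df1 = df.reindex(indexes_to_search).dropna(how="all")
print(df1)
```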
Upvotes: 5