Reputation: 149
Let's suppose, there is a dataframe :
df1 =
A B C
0 1 a a1
1 2 b b2
2 3 c c3
3 4 d d4
4 5 e e5
5 6 f f6
Created as :
a1 = [1,2,3,4,5,6]
a2 = ['a','b','c','d','e','f']
a3 = ['a1','b2','c3','d4','e5','f6']
df1 = pd.DataFrame(list(zip(a1,a2,a3)),columns=["A","B","C"])
Here, I am considering Columns A and B to be something like primary keys for this dataframe.
So, PK = ["A","B"]
.
I have another list, list1 = [[2,'b'],[5,'e']]
, which is a subset of the dataframe df[PK]
.
Is there any way I can get the rows corresponding to these primary key values inside the list from the dataframe df?
Something like :
df1 = df[df[PK].values.isin(list1)]
which doesn't work as I expect.
I would like to get an output df1 as :
df1 =
A B C
1 2 b b2
4 5 e e5
There are some similar questions, which I have gone through in this portal. But none of them showed me how to select rows based on filter on multiple columns as mentioned above. Thanks in advance.
Upvotes: 0
Views: 37
Reputation: 27547
Here is how you can use pandas.DataFrame.merge()
:
import pandas as pd
a1 = [1,2,3,4,5,6]
a2 = ['a','b','c','d','e','f']
a3 = ['a1','b2','c3','d4','e5','f6']
df1 = pd.DataFrame(list(zip(a1,a2,a3)),columns=["A","B","C"])
PK = ["A","B"]
list1 = [[2,'b'],[5,'e']]
df2 = df1.merge(pd.DataFrame(list1,columns=PK),on=PK)
print(df2)
Output:
A B C
0 2 b b2
1 5 e e5
Upvotes: 1