I am trying to filter a pandas df based on specific values. For the df below, I want to select the rows with values A, C in column Event. I also want to select the rows with values U, Z in column Code.
import pandas as pd
d = ({
'Event' : ['A','B','C','D','E','A','B','C','D'],
'Code' : ['W','X','Y','U','Z','X','Y','W','Z'],
'Int' : [1,2,3,4,5,6,7,8,9]
})
df = pd.DataFrame(data = d)
I can do it via one column:
df = df.loc[df['Event'].isin(['A','C'])]
But if I then try to filter on the second column:
df = df.loc[df['Code'].isin(['U','Z'])]
It returns an empty df. My intended df is:
  Event Code  Int
0     A    W    1
1     C    Y    3
2     D    U    4
3     E    Z    5
4     A    X    6
5     C    W    8
6     D    Z    9
Upvotes: 1
Views: 1988
Reputation: 264
Here you can filter on values from different columns at the same time (use | to match either condition, & to require both):
new_df = df[(df["Event"] == 'A') | (df["Code"] == 'U')]
Upvotes: 0
Reputation: 59
What's happening here is that you first select the rows with A or C in Event, and then search within that result for rows with U or Z in Code. But if you look, none of the rows with A or C in Event have U or Z in the Code column. That is why you are getting an empty dataframe.
Try the below:
newdf = df.query("Event in ['A','C'] | Code in ['U','Z']")
newdf
  Event Code  Int
0     A    W    1
2     C    Y    3
3     D    U    4
4     E    Z    5
5     A    X    6
7     C    W    8
8     D    Z    9
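To see why the two filters applied one after the other come back empty while a combined "either" condition does not, here is a small sketch that reuses the question's data. The names step1, step2 and either are only illustrative, not from the original post.

```python
import pandas as pd

# Same data as in the question
df = pd.DataFrame({
    'Event': ['A', 'B', 'C', 'D', 'E', 'A', 'B', 'C', 'D'],
    'Code': ['W', 'X', 'Y', 'U', 'Z', 'X', 'Y', 'W', 'Z'],
    'Int': [1, 2, 3, 4, 5, 6, 7, 8, 9],
})

# Filtering twice in a row keeps only rows that pass BOTH tests (a logical AND)
step1 = df.loc[df['Event'].isin(['A', 'C'])]
step2 = step1.loc[step1['Code'].isin(['U', 'Z'])]
print(len(step2))  # no row has Event in A/C and Code in U/Z

# Combining the two masks with | keeps rows that pass EITHER test (a logical OR)
either = df.loc[df['Event'].isin(['A', 'C']) | df['Code'].isin(['U', 'Z'])]
print(either)
```

The second filter in the question runs on an already reduced frame, so it can only ever remove more rows, never add the D and E rows back.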
Upvotes: 1
Reputation: 1233
One possible solution.
df[(df.Code.isin(['U','Z'])) | (df.Event.isin(['A', 'C']))]
Upvotes: 0
Reputation: 12417
I think you need:
df = df.loc[df['Event'].isin(['A','C']) | df['Code'].isin(['U','Z'])].reset_index(drop=True)
Output:
  Code Event  Int
0    W     A    1
1    Y     C    3
2    U     D    4
3    Z     E    5
4    X     A    6
5    W     C    8
6    Z     D    9
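As a rough check that the boolean-mask and query approaches from this thread select the same rows, and that reset_index(drop=True) is what renumbers the index to match the intended output, a sketch along these lines could be used. The variable names are hypothetical; the data is the question's.

```python
import pandas as pd

# Same data as in the question
df = pd.DataFrame({
    'Event': ['A', 'B', 'C', 'D', 'E', 'A', 'B', 'C', 'D'],
    'Code': ['W', 'X', 'Y', 'U', 'Z', 'X', 'Y', 'W', 'Z'],
    'Int': [1, 2, 3, 4, 5, 6, 7, 8, 9],
})

# Boolean masks combined with | (rows matching either condition)
mask_result = df[df['Event'].isin(['A', 'C']) | df['Code'].isin(['U', 'Z'])]

# The same selection written as a query string
query_result = df.query("Event in ['A','C'] | Code in ['U','Z']")

# Both keep the original row labels; reset_index(drop=True) renumbers from 0
renumbered = mask_result.reset_index(drop=True)

print(mask_result.equals(query_result))
print(renumbered)
```

Without reset_index(drop=True) the result keeps the original labels (0, 2, 3, ...), which is the only difference between the outputs shown in the answers.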
Upvotes: 2