Derek
Derek

Reputation: 59

Python / Pandas: Dataframe multiple filters

I have the following dataframe which is list of races and results.

      Date       R   H   Fin  Win
0     11182017   1   1   2     0 
1     11182017   1   2   1     5   
2     11182017   1   3   3     0   
3     11182017   2   1   2     0   
4     11182017   2   2   1     10   
5     11182017   3   1   1     6    
6     11182017   3   2   2     0   

I want to be able to filter first by race (R) then by horse (H) and return the results.

For the dataset above when R = 1 I want to see only H 1 & 2 (ie filter out H 3 for R 1). For R = 2 I want to to see only H 1 (ie filter out H 2 for R 2) and for R =3 I only want to see H 2 (ie filter out H 1 for R 3). I would accomplish this by setting a variable preferably via a prompt. Essentially I am trying to "scratch" certain horses from a particular race

Results Example below

      Date       R   H   Fin  Win
0     11182017   1   1   2     0 
1     11182017   1   2   1     5    
3     11182017   2   1   2     0   
6     11182017   3   2   2     0   

Upvotes: 1

Views: 88

Answers (2)

DJK
DJK

Reputation: 9274

A dictionary is an option for storing races and horse in that you would not need to create unnecessary variables every time you run a new filtering setup, here I use a dictionary with keys = races and values = list of horses

d = {1:[1,2],2:[1]}

Now you can use the query function, which accepts a query string. This can be made by iterating over the dictionary

query_str = ' | '.join(['((R == {x}) and (H in {y}))'.format(x=x,y=y) for x,y in d.items()])

The string generated with list comp would look like this in this example

((R == 1) and (H in [1, 2])) | ((R == 2) and (H in [1]))

Now you can run

df.query(query_str)

and get

       Date  Fin  H  R  Win
0  11182017    2  1  1    0
1  11182017    1  2  1    5
3  11182017    2  1  2    0

Additional Notes on User input to create dictionary

Code

d = {}
x = input('Add races ')
for i in x:
    d[i] = list(input('Add Horses for Race: {} '.format(i)))
print d

Sample Run

Add races 1,2,3
Add Horses for Race: 1 1,2
Add Horses for Race: 2 2,3
Add Horses for Race: 3 3,4
{1: [1, 2], 2: [2, 3], 3: [3, 4]}

Upvotes: 1

rafaelc
rafaelc

Reputation: 59304

IIUC, you want someone to input R and H to get values... You can use input in python3.

races = input("Enter races: ").split(",")
ind = []
for race in races:
    sub = df[df["R"] == int(race)]
    horses = input("Enter horses to show for race {0}: ".format(race)).split(",")
    rows = sub.H.isin(horses)
    ind.extend(rows[rows].index.values)

print(df[df.index.isin(ind)])

Example:

Enter races: 1,2
Enter horses for race 1: 1,2
Enter horses for race 2: 2
   R  H
0  1  1
1  1  2
4  2  2

Upvotes: 0

Related Questions