aarribas12

Reputation: 574

Select only a number of rows from a pandas Dataframe based on a condition

I want to sample n rows for each distinct value in the column named club.


columns = ['long_name','age','dob','height_cm','weight_kg','club']
teams = ['Real Madrid','FC Barcelona','Chelsea','CA Osasuna','Paris Saint-Germain','FC Bayern München','Atlético Madrid','Manchester City','Liverpool','Hull City']
playersDataDB = playersData.loc[playersData['club'].isin(teams)][columns]
playersDataDB.head()

In the code above I have selected my desired columns for the players who belong to the selected teams.

The output of this code is a 299 rows × 6 columns DataFrame, meaning that I'm getting all the players from each team, but I want just 16 of them from each club.

Upvotes: 1

Views: 750

Answers (2)

eduardoftdo

Reputation: 402

I'm not sure what your DataFrame looks like, but you could group by club and then use head(16) to keep only the first 16 rows of each group.

df.groupby('club').head(16)
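Note that head(16) keeps the first 16 rows of each club in their original order, which is not a random sample. If a random pick is what you're after, here is a minimal sketch of a few options. The toy `players` frame, its values, and `n = 2` are made up for illustration (the question uses 16), and `groupby(...).sample` assumes pandas 1.1 or newer.

```python
import pandas as pd

# Hypothetical toy data standing in for the real DataFrame; only 'club' matters here.
players = pd.DataFrame({
    "long_name": [f"player_{i}" for i in range(10)],
    "club": ["Real Madrid"] * 5 + ["Chelsea"] * 3 + ["Liverpool"] * 2,
})

n = 2  # rows wanted per club (16 in the question)

# Option 1: first n rows of each club, kept in their original order.
first_n = players.groupby("club").head(n)

# Option 2: n random rows of each club (pandas >= 1.1).
# This raises ValueError if any club has fewer than n rows.
random_n = players.groupby("club").sample(n=n, random_state=0)

# Option 3: shuffle all rows first, then take the first n per club.
# Clubs with fewer than n rows simply contribute all the rows they have.
shuffled_n = players.sample(frac=1, random_state=0).groupby("club").head(n)
```

Option 3 is probably the safer choice when some clubs might have fewer than 16 players, since it never raises on small groups.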

Upvotes: 2

Nike
Nike

Reputation: 1307

You can use isin like this:

playersDataDB = playersData[playersData['club'].isin(teams)]
playersDataDB.head()
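On its own, this filter still returns every player of the selected clubs. A possible way to combine it with a per-club cap is sketched below. The small `playersData` frame, the shortened `columns` and `teams` lists, and `n = 2` are placeholders for illustration (the question uses 16).

```python
import pandas as pd

# Hypothetical stand-in for playersData with a subset of the question's columns.
playersData = pd.DataFrame({
    "long_name": [f"player_{i}" for i in range(8)],
    "age": [20 + i for i in range(8)],
    "club": ["Chelsea"] * 4 + ["Liverpool"] * 3 + ["Some Other Club"],
})
columns = ["long_name", "age", "club"]
teams = ["Chelsea", "Liverpool"]
n = 2  # rows to keep per club (16 in the question)

# Row filter and column selection in a single .loc call.
playersDataDB = playersData.loc[playersData["club"].isin(teams), columns]

# Keep at most n rows per club, in their original order.
playersDataDB = playersDataDB.groupby("club").head(n)
```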

Upvotes: 1
