Reputation: 574
I want to sample n rows for each distinct value in the column named club.
columns = ['long_name','age','dob','height_cm','weight_kg','club']
teams = ['Real Madrid','FC Barcelona','Chelsea','CA Osasuna','Paris Saint-Germain','FC Bayern München','Atlético Madrid','Manchester City','Liverpool','Hull City']
playersDataDB = playersData.loc[playersData['club'].isin(teams), columns]
playersDataDB.head()
In the code above I have selected my desired columns for the players who belong to the selected teams.
The output of this code is a DataFrame with 299 rows × 6 columns, meaning that I'm getting every player from those teams, but I want just 16 players from each club.
Upvotes: 1
Views: 750
Reputation: 402
I'm not sure what your dataframe looks like, but you could group by club and then use head(16) to keep only the first 16 rows of each group.
df.groupby('club').head(16)
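Note that head(16) keeps the first 16 rows of each club in their existing order rather than a random pick. If a random sample is what you're after, one option is GroupBy.sample (available in pandas 1.1 and later, as far as I know). A minimal sketch on a made-up toy frame, with n = 2 instead of 16; the data and variable names below are hypothetical:

```python
import pandas as pd

# Hypothetical toy data standing in for the real players table
df = pd.DataFrame({
    "long_name": ["a", "b", "c", "d", "e", "f"],
    "club": ["Chelsea", "Chelsea", "Chelsea",
             "Liverpool", "Liverpool", "Liverpool"],
})

n = 2  # would be 16 in the question

# First n rows per club, in the frame's existing order
first_n = df.groupby("club").head(n)

# n randomly chosen rows per club; random_state makes the draw repeatable
random_n = df.groupby("club").sample(n=n, random_state=0)
```

Be aware that sample(n=...) should raise a ValueError if a club has fewer than n rows (unless replace=True), whereas head(n) simply returns whatever rows the group has.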
Upvotes: 2
Reputation: 1307
You can use isin like this:
playersDataDB = playersData[playersData['club'].isin(teams)]
playersDataDB.head()
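On its own, the isin filter only restricts the rows to the chosen clubs; it does not cap the number of players per club. One way to combine the two steps is to filter, shuffle, and then take the first n rows of each club. This is only a sketch under assumptions: the small playersData frame, the shortened columns list, and n = 2 are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for playersData (the real one has more columns)
playersData = pd.DataFrame({
    "long_name": [f"player_{i}" for i in range(8)],
    "age": [20 + i for i in range(8)],
    "club": ["Chelsea"] * 4 + ["Liverpool"] * 1 + ["Other FC"] * 3,
})
columns = ["long_name", "age", "club"]
teams = ["Chelsea", "Liverpool"]
n = 2  # would be 16 in the question

# Step 1: keep only the selected clubs and the desired columns
filtered = playersData.loc[playersData["club"].isin(teams), columns]

# Step 2: shuffle all rows, then keep at most n rows per club
playersDataDB = (
    filtered.sample(frac=1, random_state=0)  # frac=1 shuffles every row
    .groupby("club")
    .head(n)
)
```

Shuffling before head(n) gives a random pick per club and still works when a club has fewer than n players: that club just contributes all of its rows.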
Upvotes: 1