Matt C

Reputation: 1563

Count specific field in dataframe groupby

I'm new to Python and trying to get my head around how to manipulate Pandas dataframes. I'm using the winemag-data-130k-v2.csv dataset. The fields of interest are 'country', 'province', 'winery', and 'variety'.

The first thing I'd like to do is determine the number of wineries per province. I can get as far as:

reviews_df.groupby(['country','province']).size()

But this gives me the number of rows per group, so a winery that produces 3 varieties is counted 3 times. What I want is the equivalent of COUNT(DISTINCT winery) in SQL. Suggestions?

Upvotes: 0

Views: 26

Answers (1)

zipa

Reputation: 27899

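What you need is nunique(). Select the 'winery' column after grouping so that only distinct wineries are counted in each group:

reviews_df.groupby(['country','province'])['winery'].nunique()

Calling nunique() on the whole groupby, without selecting a column, returns the distinct count for every remaining column, not just 'winery'.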
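
To see the difference between size() and nunique(), here is a minimal sketch on a small made-up frame. The rows are invented for illustration and are not taken from winemag-data-130k-v2.csv; only the column names follow the question.

```python
import pandas as pd

# Hypothetical sample rows; not real data from the wine reviews dataset.
reviews_df = pd.DataFrame({
    "country":  ["Italy", "Italy", "Italy", "US", "US"],
    "province": ["Tuscany", "Tuscany", "Tuscany", "Oregon", "Oregon"],
    "winery":   ["A", "A", "B", "C", "C"],
    "variety":  ["Red", "White", "Red", "Pinot", "Riesling"],
})

grouped = reviews_df.groupby(["country", "province"])

# size() counts rows, so a winery with several varieties is counted repeatedly.
row_counts = grouped.size()

# nunique() on the 'winery' column counts each winery once per group,
# like COUNT(DISTINCT winery) ... GROUP BY country, province in SQL.
winery_counts = grouped["winery"].nunique()

print(row_counts)
print(winery_counts)
```

If you would rather have a flat dataframe than a Series with a two-level index, winery_counts.reset_index(name="n_wineries") should do it; the column name n_wineries is just a placeholder.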

Upvotes: 1
