cosmosa

Reputation: 761

Pandas: number of subgroups per index

I have a DataFrame with index index1 and columns val1 and val2. I want the number of unique val1 values for each index1 value.

DataFrame:

import pandas as pd

df = pd.DataFrame(columns=['index1', 'val1', 'val2'], data=[['A', 1, 1], ['A', 1, 1], ['A', 2, 1]])

df = df.set_index(['index1'])

I group like this:

groupby = df.groupby([df.index, 'val1']) 

Then I call size(), which returns:

    index1  val1
    A       1       2
            2       1
    dtype: int64

This returns the number of rows in each group. What I am looking for is the number of groups each index1 value has, i.e. A has 2 unique groups.

Upvotes: 1

Views: 943

Answers (2)

jezrael

Reputation: 862511

I think you need SeriesGroupBy.nunique if you want to count the unique values of a single column per group:

df1 = df.groupby(level=0)['val1'].nunique() 
print (df1)
index1
A    2
Name: val1, dtype: int64

df1 = df.groupby(level=0)['val1'].nunique().reset_index().rename(columns={'val1':'uniq'})
print (df1)
  index1  uniq
0      A     2

And if you need to count unique values in all columns, use agg with nunique:

df1 = df.groupby(level=0).agg('nunique')
print (df1)
        val1  val2
index1            
A          2     1
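As a cross-check, here is a minimal sketch that ties this back to the size() output in the question: counting the (index1, val1) groups per index1 value should give the same number as nunique on val1. The names sizes, groups_per_index and uniq_per_index are only illustrative.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame(
    columns=['index1', 'val1', 'val2'],
    data=[['A', 1, 1], ['A', 1, 1], ['A', 2, 1]],
).set_index('index1')

# Number of rows in each (index1, val1) group, as in the question.
sizes = df.groupby([df.index, 'val1']).size()

# Count how many (index1, val1) groups exist for each index1 value.
groups_per_index = sizes.groupby(level=0).size()

# The same count, computed directly with nunique on the val1 column.
uniq_per_index = df.groupby(level=0)['val1'].nunique()

print(groups_per_index)
print(uniq_per_index)
```

The two results should only differ if val1 contains missing values, since both groupby and nunique skip NaN by default.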

Upvotes: 0

Ted Petrou

Reputation: 61947

If you want the number of unique items in each column for every index1 value, you can do the following:

df.groupby(level=0).agg(lambda x: len(x.unique()))

        val1  val2
index1            
A          2     1
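One difference worth knowing when choosing between len(x.unique()) and nunique: unique() keeps NaN as a value, while nunique drops it unless dropna=False is passed. A small sketch, using a made-up frame df_nan that is not part of the question's data:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in val1.
df_nan = pd.DataFrame(
    {'val1': [1.0, np.nan, 2.0]},
    index=pd.Index(['A', 'A', 'A'], name='index1'),
)

grouped = df_nan.groupby(level=0)['val1']

# unique() includes NaN, so it is counted as a distinct value.
with_nan = grouped.agg(lambda x: len(x.unique()))

# nunique() ignores NaN by default.
without_nan = grouped.nunique()

# Passing dropna=False makes nunique count NaN as well.
with_nan_too = grouped.nunique(dropna=False)

print(with_nan['A'], without_nan['A'], with_nan_too['A'])
```

For the sample data in the question there are no missing values, so both approaches give the same answer.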

Upvotes: 1
