aim

Reputation: 311

Adding columns based on count and unique count in Python (pandas)

I have a dataframe as shown below.

type  item
new   apple
new   apple
new   io
new   io
old   apple
old   io
old   io
old   se
old   pj
etc   el

I need to create a new dataframe based on the count and the unique count:

type    type_count  unique_item_count
new            4    2
old            5    4
etc            1    1

Column 'type_count' is the frequency of each label in column 'type'. Column 'unique_item_count' is the number of distinct labels in column 'item' for each unique label in column 'type'.

Also, suppose I add a new column:

type  item   val
new   apple   20
new   apple    6
new   io       5
new   io       6
old   apple    5
old   io       6
old   io       4
old   se       5
old   pj       3
etc   el       2

and want a new dataframe with:

type    type_count  unique_item_count   total_count
new             4                   2   37
old             5                   4   23
etc             1                   1   2

Column 'total_count' is the sum of the values in column 'val' for each type.

Upvotes: 2

Views: 113

Answers (1)

jezrael

Reputation: 862601

Use agg on the grouped 'item' column (SeriesGroupBy.agg) with a list of tuples. In each tuple, the first value specifies the new column name and the second the aggregate function, here size and nunique:

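# each tuple is (new column name, aggregate function)
L = [('type_count', 'size'), ('unique_item_count', 'nunique')]
# sort=False keeps the groups in order of first appearance
df = df.groupby('type', sort=False)['item'].agg(L).reset_index()
print(df)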
  type  type_count  unique_item_count
0  new           4                  2
1  old           5                  4
2  etc           1                  1
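
For the second part of the question (the 'total_count' column), one possible approach is named aggregation, which lets each output column pick its own source column. This is a minimal sketch, assuming pandas 0.25 or later and rebuilding the sample data from the question; the variable name summary is only illustrative:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "type": ["new"] * 4 + ["old"] * 5 + ["etc"],
    "item": ["apple", "apple", "io", "io",
             "apple", "io", "io", "se", "pj", "el"],
    "val": [20, 6, 5, 6, 5, 6, 4, 5, 3, 2],
})

# Named aggregation: output_name=(source column, aggregate function)
summary = (
    df.groupby("type", sort=False)
      .agg(
          type_count=("item", "size"),            # rows per type
          unique_item_count=("item", "nunique"),  # distinct items per type
          total_count=("val", "sum"),             # sum of val per type
      )
      .reset_index()
)
print(summary)
```

With the sample data this should give total_count values of 37, 23 and 2 for new, old and etc, matching the expected output in the question.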

Upvotes: 3
