Reputation: 311
I have a DataFrame as shown below.
type item
new apple
new apple
new io
new io
old apple
old io
old io
old se
old pj
etc el
I need to create a new DataFrame based on the count and the unique count:
type type_count unique_item_count
new 4 2
old 5 4
etc 1 1
Column 'type_count' is the frequency of each label in column 'type'. Column 'unique_item_count' is the number of unique labels in column 'item' for each unique label in column 'type'.
Also, suppose I add a new column:
type item val
new apple 20
new apple 6
new io 5
new io 6
old apple 5
old io 6
old io 4
old se 5
old pj 3
etc el 2
and want a new DataFrame with:
type type_count unique_item_count total_count
new 4 2 37
old 5 4 23
etc 1 1 2
Column 'total_count' is the sum of the values in column 'val' for each type.
Upvotes: 2
Views: 113
Reputation: 862601
Use DataFrameGroupBy.agg with a list of tuples, where the first value of each tuple specifies the new column name and the second the aggregate function, here size and nunique:
L = [('type_count','size'), ('unique_item_count','nunique')]
df = df.groupby('type', sort=False)['item'].agg(L).reset_index()
print(df)
   type  type_count  unique_item_count
0   new           4                  2
1   old           5                  4
2   etc           1                  1
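For the second part of the question (the extra 'val' column), one possible approach is named aggregation, which lets each output column draw on a different input column. This is only a sketch: it assumes pandas 0.25 or later, and the DataFrame below is simply the sample data from the question rebuilt by hand.

```python
import pandas as pd

# Sample data copied from the question (rebuilt here so the snippet runs on its own).
df = pd.DataFrame({
    "type": ["new"] * 4 + ["old"] * 5 + ["etc"],
    "item": ["apple", "apple", "io", "io", "apple", "io", "io", "se", "pj", "el"],
    "val": [20, 6, 5, 6, 5, 6, 4, 5, 3, 2],
})

# Named aggregation: output_name=(input_column, aggregate_function).
# sort=False keeps the groups in order of first appearance (new, old, etc).
result = (
    df.groupby("type", sort=False)
      .agg(
          type_count=("item", "size"),            # rows per type
          unique_item_count=("item", "nunique"),  # distinct items per type
          total_count=("val", "sum"),             # sum of val per type
      )
      .reset_index()
)
print(result)
```

With the sample data this should give total_count values of 37, 23 and 2 for new, old and etc, matching the table in the question.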
Upvotes: 3