Ahsan Mukhtar

Reputation: 649

Python pandas dataframe aggregation output

I have data like this:

Sr.No   Value1  Value2  Value3  Type
    1   2       1        N      A
    2   3       1        M      A
    3   3       2        X      A
    4   0       1        Y      B
    5   10      2        Z      B
    6   3       4        J      A
    7   5       6        K      A

How can I generate output like this using pandas in Python?

Sr.No   Value1  Value2  Value3     Type
1       8       4       N,M,X       A
2       10      3       Y,Z         B
3       8       10      J,K         A

Basically, I want to group consecutive rows that share the same Type, summing Value1 and Value2 and joining Value3 with commas. How can I achieve this?

I have attempted it with this code, but it is not returning the required output.

grpbook = pandabook.groupby(['Value1','Value2','Type']).agg({'Value1': 'sum','Value2': 'sum', 'Value3': lambda ind: ','.join(ind) })

Note that Type A is split into two groups, because its rows do not all appear consecutively.

Upvotes: 2

Views: 136

Answers (1)

Chris

Reputation: 29742

You can start with cumsum to generate a pseudo group ID that increases each time Type changes:

df["group_id"] = (df.Type != df.Type.shift()).cumsum()
0    1
1    1
2    1
3    2
4    2
5    3
6    3
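To see why this works, the one-liner can be split into its individual steps. This is only a minimal sketch: it assumes the question's Type column is rebuilt by hand as a Series, and the names `types`, `changed` and `group_id` are illustrative rather than part of the original answer.

```python
import pandas as pd

# Type column from the question, rebuilt by hand for illustration
types = pd.Series(["A", "A", "A", "B", "B", "A", "A"], name="Type")

# True wherever a row's Type differs from the previous row's Type.
# The first row is compared with NaN, so it always counts as a change.
changed = types != types.shift()

# Every True starts a new run, so the running total labels consecutive runs
group_id = changed.cumsum()

print(group_id.tolist())  # [1, 1, 1, 2, 2, 3, 3]
```

Because the label only increases when Type changes, the two separate runs of A get different IDs (1 and 3) instead of being merged.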

And use it to divide df into groups:

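# Sum only the numeric columns; the summed Sr.No is overwritten on the next line.
# drop(columns=...) replaces the positional axis argument, which pandas 2.0 no longer accepts.
df = df.groupby(["group_id", "Type"])[["Sr.No", "Value1", "Value2"]].sum().reset_index().drop(columns="group_id")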
df["Sr.No"] = df.index+1
print(df)
  Type  Sr.No  Value1  Value2
0    A      1       8       4
1    B      2      10       3
2    A      3       8      10
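The result above covers the two sums, while the question also asks for Value3 joined with commas. One possible way to get everything in a single pass is to hand agg a per-column mapping. This is a sketch, not a definitive implementation: it assumes the sample data is rebuilt by hand, that Value3 holds strings, and the name `result` is purely illustrative.

```python
import pandas as pd

# Sample data from the question, rebuilt by hand
df = pd.DataFrame({
    "Sr.No": [1, 2, 3, 4, 5, 6, 7],
    "Value1": [2, 3, 3, 0, 10, 3, 5],
    "Value2": [1, 1, 2, 1, 2, 4, 6],
    "Value3": ["N", "M", "X", "Y", "Z", "J", "K"],
    "Type": ["A", "A", "A", "B", "B", "A", "A"],
})

# Label each run of consecutive rows that share the same Type
group_id = (df["Type"] != df["Type"].shift()).cumsum().rename("group_id")

# Aggregate each column with its own rule
result = (
    df.groupby(group_id)
      .agg({
          "Value1": "sum",
          "Value2": "sum",
          "Value3": ",".join,   # assumes Value3 contains only strings
          "Type": "first",      # Type is constant within a run
      })
      .reset_index(drop=True)
)

# Renumber the groups and restore the column order from the question
result["Sr.No"] = result.index + 1
result = result[["Sr.No", "Value1", "Value2", "Value3", "Type"]]
print(result)
```

Grouping by the `group_id` Series directly, instead of adding it as a column, leaves the original df unchanged and avoids having to drop the helper column afterwards.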

Upvotes: 4
