mommomonthewind

Reputation: 4640

pandas: group the continuous rows with same values into one group

Assume I have a pandas dataframe of purchases with no invoice ID, like this:

item_id customer_id
1 A
2 A
1 B
3 C
4 C
1 A
5 A

My assumption is that if multiple items are bought by the same customer in consecutive rows, they belong to one order. So I would like to create an order_id column like this:

item_id customer_id order_id
1 A 1
2 A 1
1 B 2
3 C 3
4 C 3
1 A 4
5 A 4

The order_id should be generated automatically and increase by one for each new group. How can I do that with pandas?

Many thanks

Upvotes: 2

Views: 518

Answers (1)

Nk03

Reputation: 14949

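IIUC, here's one way: flag every row whose customer_id differs from the previous row's, then take the cumulative sum of those flags: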

df['order_id'] = df.customer_id.ne(df.customer_id.shift()).cumsum()

OUTPUT:

   item_id customer_id  order_id
0        1           A         1
1        2           A         1
2        1           B         2
3        3           C         3
4        4           C         3
5        1           A         4
6        5           A         4
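
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and splits the one-liner into its two steps. The variable name is_new_run is only for illustration and is not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "item_id": [1, 2, 1, 3, 4, 1, 5],
    "customer_id": ["A", "A", "B", "C", "C", "A", "A"],
})

# True wherever customer_id differs from the previous row.
# The first row is compared against NaN, so it is always True.
is_new_run = df["customer_id"].ne(df["customer_id"].shift())

# Each True adds 1 to the running total, so every run of equal
# customer_id values shares one number, starting at 1.
df["order_id"] = is_new_run.cumsum()

print(df)
```

Note that customer A gets two different order_id values (1 and 4) because the two runs are not adjacent, which matches what the question asks for.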

Upvotes: 5
