Reputation: 4640
Suppose I have a pandas DataFrame of purchases with no invoice ID, like this:
item_id customer_id
1 A
2 A
1 B
3 C
4 C
1 A
5 A
My assumption is that items bought by the same customer in consecutive rows belong to one order. So I would like to create an order_id column like this:
item_id customer_id order_id
1 A 1
2 A 1
1 B 2
3 C 3
4 C 3
1 A 4
5 A 4
The order_id should be generated automatically and increase by one for each new order. How can I do that with pandas?
Many thanks
Upvotes: 2
Views: 518
Reputation: 14949
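IIUC, here's one way: flag every row whose customer_id differs from the previous row, then take the cumulative sum of those flags: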
df['order_id'] = df.customer_id.ne(df.customer_id.shift()).cumsum()
OUTPUT:
item_id customer_id order_id
0 1 A 1
1 2 A 1
2 1 B 2
3 3 C 3
4 4 C 3
5 1 A 4
6 5 A 4
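To see why this works, it may help to split the one-liner into its two steps. The sketch below rebuilds the sample frame from the question; the intermediate name `is_new_order` is only illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "item_id": [1, 2, 1, 3, 4, 1, 5],
    "customer_id": ["A", "A", "B", "C", "C", "A", "A"],
})

# Step 1: True wherever customer_id differs from the row above.
# shift() puts NaN in the first row, so the first row is always True.
is_new_order = df["customer_id"].ne(df["customer_id"].shift())

# Step 2: a running count of the True flags gives an incremental ID
# that only increases when the customer changes.
df["order_id"] = is_new_order.cumsum()

print(df)
```

Note that this relies on row order: the same customer appearing again later (like A in the last two rows) gets a new order_id, which matches the grouping asked for in the question.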
Upvotes: 5