Reputation: 317
This is my data:
print(n0data)
FULL_MPID DateTime EquipID count
Index
1 5092761672035390000000000000 2018-11-28 00:36:00 1296 1
2 5092761672035390000000000000 2018-11-28 00:37:00 1634 2
3 5092761672035390000000000000 2018-11-28 13:36:00 1296 3
4 5092761672035390000000000000 2018-11-28 13:38:00 1634 4
5 5092761672035390000000000000 2018-11-29 17:37:00 1290 5
6 5092761672035390000000000000 2018-11-29 17:37:00 1634 6
7 5092761672035390000000000000 2018-11-30 21:23:00 1290 7
8 5092761672035390000000000000 2018-11-30 21:24:00 1634 8
9 5092761672035390000000000000 2018-12-02 09:37:00 1296 9
10 5092761672035390000000000000 2018-12-02 09:39:00 1634 10
11 5092761672035390000000000000 2018-12-02 09:39:00 1634 11
12 5092761672035390000000000000 2018-12-03 11:55:00 1290 12
13 5092761672035390000000000000 2018-12-03 12:02:00 1634 13
14 5092761672035390000000000000 2018-12-06 12:22:00 1290 14
15 5092761672035390000000000000 2018-12-06 12:22:00 1634 15
16 5092761672035390000000000000 2018-12-06 12:22:00 1634 16
17 5092761672035390000000000000 2018-12-06 12:23:00 1634 17
18 5092761672035390000000000000 2018-12-06 12:23:00 1634 18
19 5092761672035390000000000000 2018-12-06 12:23:00 1634 19
20 5092761672035390000000000000 2018-12-06 12:23:00 1634 20
21 5092761672035390000000000000 2018-12-06 12:23:00 1634 21
22 5092761672035390000000000000 2018-12-09 05:51:00 1290 22
I create the ecount column with the following groupby command:
n0data['ecount'] = n0data.groupby(['EquipID','FULL_MPID']).cumcount() + 1
The data is sorted by time, and the goal is to identify when EquipID changes over from one value to another.
ecount is supposed to behave as follows:
When the value in the EquipID column changes from one row to the next, ecount should reset. However, if EquipID does not change, as in rows 15-21, ecount should keep counting. I thought this was what groupby delivered as well...
Upvotes: 2
Views: 1093
Reputation: 402263
You can use the shift and cumsum trick before groupby:
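# df here stands for the question's n0data frame
# True wherever EquipID differs from the previous row (the first row is always True)
v = df.EquipID.ne(df.EquipID.shift())
# The cumulative sum of the flags labels each consecutive run; count within each run
v.groupby(v.cumsum()).cumcount() + 1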
Index
1 1
2 1
3 1
4 1
5 1
6 1
7 1
8 1
9 1
10 1
11 2
12 1
13 1
14 1
15 1
16 2
17 3
18 4
19 5
20 6
21 7
22 1
dtype: int64
Upvotes: 2