Reputation: 289
I have a pandas DataFrame and want to add a key column that identifies each increasing run of values.
Example input:
time range
2018-03-04 00:00:06.520 0
2018-03-04 00:00:07.130 0
2018-03-04 00:00:07.850 1
2018-03-04 00:00:08.420 1
2018-03-04 00:00:09.210 2
2018-03-04 00:00:10.070 2
2018-03-04 00:00:10.840 3
2018-03-04 00:00:11.230 3
2018-03-04 00:00:11.980 4
2018-03-04 00:00:12.560 4
2018-03-04 00:00:13.120 0
2018-03-04 00:00:13.790 0
2018-03-04 00:00:14.330 1
2018-03-04 00:00:15.280 1
2018-03-04 00:00:15.960 2
2018-03-04 00:00:16.420 2
2018-03-04 00:00:17.090 3
The output DataFrame I want is:
time range Key
2018-03-04 00:00:06.520 0 1
2018-03-04 00:00:07.130 0 1
2018-03-04 00:00:07.850 1 1
2018-03-04 00:00:08.420 1 1
2018-03-04 00:00:09.210 2 1
2018-03-04 00:00:10.070 2 1
2018-03-04 00:00:10.840 3 1
2018-03-04 00:00:11.230 3 1
2018-03-04 00:00:11.980 4 1
2018-03-04 00:00:12.560 4 1
2018-03-04 00:00:13.120 0 2
2018-03-04 00:00:13.790 0 2
2018-03-04 00:00:14.330 1 2
2018-03-04 00:00:15.280 1 2
2018-03-04 00:00:15.960 2 2
2018-03-04 00:00:16.420 2 2
2018-03-04 00:00:17.090 3 2
...
I want to use range and time to get Key values that increase each time range starts over.
How can I do it?
Upvotes: 1
Views: 83
Reputation: 294488
diff
df.assign(Key=df.range.diff().lt(0).cumsum().add(1))
time range Key
0 2018-03-04 00:00:06.520 0 1
1 2018-03-04 00:00:07.130 0 1
2 2018-03-04 00:00:07.850 1 1
3 2018-03-04 00:00:08.420 1 1
4 2018-03-04 00:00:09.210 2 1
5 2018-03-04 00:00:10.070 2 1
6 2018-03-04 00:00:10.840 3 1
7 2018-03-04 00:00:11.230 3 1
8 2018-03-04 00:00:11.980 4 1
9 2018-03-04 00:00:12.560 4 1
10 2018-03-04 00:00:13.120 0 2
11 2018-03-04 00:00:13.790 0 2
12 2018-03-04 00:00:14.330 1 2
13 2018-03-04 00:00:15.280 1 2
14 2018-03-04 00:00:15.960 2 2
15 2018-03-04 00:00:16.420 2 2
16 2018-03-04 00:00:17.090 3 2
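To make the one-liner above easier to follow, here is a minimal sketch that splits it into its three steps. The small frame is a hypothetical stand-in for the question's data (only the `range` column, shortened), and the intermediate names `step`, `is_reset` and `key` are mine.

```python
import pandas as pd

# Hypothetical sample mirroring the shape of the question's `range` column.
df = pd.DataFrame({"range": [0, 0, 1, 1, 2, 0, 0, 1]})

step = df["range"].diff()        # change from the previous row; NaN on the first row
is_reset = step.lt(0)            # True where the value dropped; NaN compares as False
key = is_reset.cumsum().add(1)   # running count of drops, offset so keys start at 1

out = df.assign(Key=key)
print(out)
```

Because the first row's difference is NaN and `NaN < 0` is False, the first row lands in key 1 without any special handling.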
flatnonzero and repeat
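import numpy as np

# a holds the length of each run: the gaps between the start, every drop, and the end
a = np.diff(np.flatnonzero(np.concatenate(
    [[True], np.diff(df.range.values) < 0, [True]]
)))
df.assign(Key=np.arange(a.size).repeat(a) + 1)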
time range Key
0 2018-03-04 00:00:06.520 0 1
1 2018-03-04 00:00:07.130 0 1
2 2018-03-04 00:00:07.850 1 1
3 2018-03-04 00:00:08.420 1 1
4 2018-03-04 00:00:09.210 2 1
5 2018-03-04 00:00:10.070 2 1
6 2018-03-04 00:00:10.840 3 1
7 2018-03-04 00:00:11.230 3 1
8 2018-03-04 00:00:11.980 4 1
9 2018-03-04 00:00:12.560 4 1
10 2018-03-04 00:00:13.120 0 2
11 2018-03-04 00:00:13.790 0 2
12 2018-03-04 00:00:14.330 1 2
13 2018-03-04 00:00:15.280 1 2
14 2018-03-04 00:00:15.960 2 2
15 2018-03-04 00:00:16.420 2 2
16 2018-03-04 00:00:17.090 3 2
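The NumPy version can be read as "find the run boundaries, measure the run lengths, repeat a counter that many times". The sketch below wraps that idea in a helper; the name `run_keys` and the sample frame are hypothetical, and it assumes the input has at least one row.

```python
import numpy as np
import pandas as pd

def run_keys(values):
    """Label each non-decreasing run in `values` with 1, 2, 3, ...

    Assumes `values` is non-empty.
    """
    values = np.asarray(values)
    # Mark the first row, every drop, and a sentinel one past the end.
    boundaries = np.concatenate([[True], np.diff(values) < 0, [True]])
    # Distances between consecutive boundaries are the run lengths.
    lengths = np.diff(np.flatnonzero(boundaries))
    # Repeat 0, 1, 2, ... once per row of each run, then shift to start at 1.
    return np.arange(lengths.size).repeat(lengths) + 1

# Hypothetical sample mirroring the shape of the question's `range` column.
df = pd.DataFrame({"range": [0, 0, 1, 1, 2, 0, 0, 1]})
keys = run_keys(df["range"].to_numpy())
out = df.assign(Key=keys)
```

Here the boundaries fall at positions 0, 5 and 8, so the run lengths are 5 and 3.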
Upvotes: 1
Reputation: 51395
I think you can use lt() (less than), shift(), and cumsum(). Together, you can make these cumulatively count each time the column range stops increasing (i.e. when the range value is less than the previous range value).
df['Key'] = df['range'].lt(df['range'].shift()).cumsum() + 1
>>> df
time range Key
0 2018-03-04 00:00:06.520 0 1
1 2018-03-04 00:00:07.130 0 1
2 2018-03-04 00:00:07.850 1 1
3 2018-03-04 00:00:08.420 1 1
4 2018-03-04 00:00:09.210 2 1
5 2018-03-04 00:00:10.070 2 1
6 2018-03-04 00:00:10.840 3 1
7 2018-03-04 00:00:11.230 3 1
8 2018-03-04 00:00:11.980 4 1
9 2018-03-04 00:00:12.560 4 1
10 2018-03-04 00:00:13.120 0 2
11 2018-03-04 00:00:13.790 0 2
12 2018-03-04 00:00:14.330 1 2
13 2018-03-04 00:00:15.280 1 2
14 2018-03-04 00:00:15.960 2 2
15 2018-03-04 00:00:16.420 2 2
16 2018-03-04 00:00:17.090 3 2
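As a rough check that comparing against shift() gives the same keys as the diff-based formulation in the other answer, here is a small sketch. The sample frame is hypothetical, and the names `prev`, `dropped`, `via_diff` and `same` are mine.

```python
import pandas as pd

# Hypothetical sample mirroring the shape of the question's `range` column.
df = pd.DataFrame({"range": [0, 0, 1, 1, 2, 0, 0, 1]})

prev = df["range"].shift()       # previous row's value; NaN on the first row
dropped = df["range"].lt(prev)   # True only where the value fell below the previous one
df["Key"] = dropped.cumsum() + 1

# Cross-check against the diff-based formulation.
via_diff = df["range"].diff().lt(0).cumsum().add(1)
same = bool((df["Key"] == via_diff).all())
```

Both forms treat equal consecutive values as part of the same run, since only a strict decrease starts a new key.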
Upvotes: 2