Gayatri Padhi
Gayatri Padhi

Reputation: 25

How to generate unique ids when mapping pairs of values from two sets in Python

I have 2 columns, 1st: Cluster, 2nd: Vehicle

    Cluster VehicleID
435 1   2
264 1   1
444 1   1
302 1   1
191 1   1
383 1   1
81  1   1
142 2   1
6   2   1
420 2   1
153 2   1
42  2   2
168 2   1
292 2   2
138 2   2
65  2   2
316 2   1
219 2   1
329 2   1
371 3   1

Basically, this tells cluster 1 has two vehicles:[1,2] and cluster has 1 vehicle. The above table is a small sample. So, I have cluster 1:[1,2], cluster 2:[1] What I want is the cluster 1's vehicle 1 needs to be mapped as 1, 2 as 2. But, cluster 2's vehicle 1 should be mapped as 3.

In short, they should be sequential and irrespective of the "Cluster" column.

I am not able to know where am I going wrong. Kindly help.

Upvotes: 2

Views: 346

Answers (1)

rafaelc
rafaelc

Reputation: 59274

You may begin with sorting such that you can take advantage of diffing to find when IDs change, and then use cumsum to retrieve your cumulative IDs.

initial = df.index

df = df.sort_values(['Cluster', 'VehicleID'])

df['new-ID'] = (df.VehicleID.diff().ne(0) | df.Cluster.diff().eq(1)).cumsum()

df.loc[initial] # back to initial ordering

     Cluster  VehicleID  new-ID
435        1          2       2
264        1          1       1
444        1          1       1
302        1          1       1
191        1          1       1
383        1          1       1
81         1          1       1
142        2          1       3
6          2          1       3
420        2          1       3
153        2          1       3
42         2          2       4
168        2          1       3
292        2          2       4
138        2          2       4
65         2          2       4
316        2          1       3
219        2          1       3
329        2          1       3
371        3          1       5

Upvotes: 2

Related Questions