Reputation: 21961
I have the foll. dataframe:
Cnt Year JD Min_Temp
S 2000 1 277.139
S 2000 2 274.725
S 2001 1 270.945
S 2001 2 271.505
N 2000 1 257.709
N 2000 2 254.533
N 2000 3 258.472
N 2001 1 255.763
N 2001 2 265.714
N 2001 3 267.943
I would like to add a new column where each separate row for a given 'Cnt' is given a unique identifier (1,2,3...). So, the result should look like this:
Cnt Year JD Min_Temp unq
S 2000 1 277.139 1
S 2000 2 274.725 2
S 2001 1 270.945 3
S 2001 2 271.505 4
N 2000 1 257.709 1
N 2000 2 254.533 2
N 2000 3 258.472 3
N 2001 1 255.763 4
N 2001 2 265.714 5
N 2001 3 267.943 6
Here, each row corresponding to the same value in the column 'Cnt' as a unique identifier.
Currently, all I can do is add a new column with increasing values: df['unq'] = numpy.arange(1,len(df))
Upvotes: 0
Views: 1635
Reputation: 353019
You could use groupby
with cumcount
>>> df["unq"] = df.groupby("Cnt").cumcount() + 1
>>> df
Cnt Year JD Min_Temp unq
0 S 2000 1 277.139 1
1 S 2000 2 274.725 2
2 S 2001 1 270.945 3
3 S 2001 2 271.505 4
4 N 2000 1 257.709 1
5 N 2000 2 254.533 2
6 N 2000 3 258.472 3
7 N 2001 1 255.763 4
8 N 2001 2 265.714 5
9 N 2001 3 267.943 6
Note that because the groups are based on the Cnt column values and not on contiguity, if you have a second group of S below the group of N, the first unq
value in that group will be 5.
Upvotes: 3