Reputation: 631
> dput(Data)
structure(list(DISTRICT = c("KARIMGANJ", "KARIMGANJ", "KARIMGANJ",
"KARIMGANJ", "KARIMGANJ", "HAILAKANDI", "HAILAKANDI", "HAILAKANDI",
"CACHAR", "CACHAR")), row.names = c(NA, -10L), class = "data.frame")
I want to create unique IDs, one per district. I have more than a thousand rows, but I guess the logic will be the same. How do I create this new column, as shown in the expected output below? Note that the sequence can also have 7 digits.
# Expected output
DISTRICT ID
1 KARIMGANJ 1111
2 KARIMGANJ 1111
3 KARIMGANJ 1111
4 KARIMGANJ 1111
5 KARIMGANJ 1111
6 HAILAKANDI 1112
7 HAILAKANDI 1112
8 HAILAKANDI 1112
9 CACHAR 1113
10 CACHAR 1113
Upvotes: 0
Views: 51
Reputation: 76545
Here is a function that accepts an argument, start, as the first new ID.
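new_id <- function(X, start = 1111){
  # group the first column, then put the groups in order of first appearance
  sp <- split(X[[1]], X[[1]])
  sp <- sp[unique(as.character(X[[1]]))]
  # 0-based group index, repeated once per row
  # (assumes the rows of each district are contiguous)
  n <- rep(seq_along(sp), lengths(sp)) - 1
  # "%d" avoids scientific notation such as "1e+06" for round 7-digit values
  sprintf("%d", as.integer(start + n))
}
Data$ID <- new_id(Data)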
Upvotes: 1
Reputation: 39707
You can use factor to create a unique ID.
Data$ID <- unclass(factor(Data$DISTRICT)) + 1000
# DISTRICT ID
#1 KARIMGANJ 1003
#2 KARIMGANJ 1003
#3 KARIMGANJ 1003
#4 KARIMGANJ 1003
#5 KARIMGANJ 1003
#6 HAILAKANDI 1002
#7 HAILAKANDI 1002
#8 HAILAKANDI 1002
#9 CACHAR 1001
#10 CACHAR 1001
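The IDs above follow the alphabetical order of the district names because factor sorts its levels by default. As a minimal sketch (the districts vector and the ids name are only illustrative), passing levels explicitly keeps the order of first appearance:

```r
# Illustrative input: one entry per district, in the order they appear
districts <- c("KARIMGANJ", "HAILAKANDI", "CACHAR")

# Default behaviour: levels are sorted, so CACHAR becomes level 1
levels(factor(districts))

# Supplying the levels in order of first appearance changes the numbering
f   <- factor(districts, levels = unique(districts))
ids <- as.integer(f) + 1000L
ids
```

Using as.integer rather than unclass also drops the levels attribute from the result.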
Or, to give the first district encountered the first ID, use match and unique.
Data$ID <- match(Data$DISTRICT, unique(Data$DISTRICT)) + 1000
# DISTRICT ID
#1 KARIMGANJ 1001
#2 KARIMGANJ 1001
#3 KARIMGANJ 1001
#4 KARIMGANJ 1001
#5 KARIMGANJ 1001
#6 HAILAKANDI 1002
#7 HAILAKANDI 1002
#8 HAILAKANDI 1002
#9 CACHAR 1003
#10 CACHAR 1003
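To get the exact numbering asked for in the question (starting at 1111, or at a 7-digit value), the same match/unique idea can be wrapped in a small helper. This is only a sketch: make_ids and its start argument are hypothetical names, not part of base R.

```r
# Hypothetical helper: integer IDs in order of first appearance,
# beginning at `start`.
make_ids <- function(x, start = 1111L) {
  # match() gives each value's position among the unique values,
  # so the rows of a district do not need to be contiguous.
  as.integer(start) + match(x, unique(x)) - 1L
}

Data <- data.frame(DISTRICT = c("KARIMGANJ", "KARIMGANJ", "KARIMGANJ",
                                "KARIMGANJ", "KARIMGANJ", "HAILAKANDI",
                                "HAILAKANDI", "HAILAKANDI", "CACHAR", "CACHAR"))

# Numbering from the question: KARIMGANJ 1111, HAILAKANDI 1112, CACHAR 1113
Data$ID <- make_ids(Data$DISTRICT)

# The same logic with a seven-digit starting value
ids7 <- make_ids(Data$DISTRICT, start = 1000000L)
```

Because the result is an integer vector rather than a string, large round values such as 1000000 are stored exactly; convert with sprintf("%d", ...) if a character column is needed.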
Upvotes: 2