Daman deep

Reputation: 631

Create sequential unique IDs of 4 to 7 digits in a data frame in R

> dput(Data)
structure(list(DISTRICT = c("KARIMGANJ", "KARIMGANJ", "KARIMGANJ", 
"KARIMGANJ", "KARIMGANJ", "HAILAKANDI", "HAILAKANDI", "HAILAKANDI", 
"CACHAR", "CACHAR")), row.names = c(NA, -10L), class = "data.frame")

I want to create a unique ID for each district. I have more than a thousand rows, but I guess the logic will be the same. How do I create this new column as shown in the expected output? Note that the sequence can have 7 digits as well.

# Expected output
     DISTRICT   ID
1   KARIMGANJ 1111
2   KARIMGANJ 1111
3   KARIMGANJ 1111
4   KARIMGANJ 1111
5   KARIMGANJ 1111
6  HAILAKANDI 1112
7  HAILAKANDI 1112
8  HAILAKANDI 1112
9      CACHAR 1113
10     CACHAR 1113

Upvotes: 0

Views: 51

Answers (2)

Rui Barradas

Reputation: 76545

Here is a function that takes an argument start, the first new ID, and returns the IDs as character strings.

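new_id <- function(X, start = 1111){
  # Split the first column by value, then reorder the groups by first appearance
  sp <- split(X[[1]], X[[1]])
  sp <- sp[unique(X[[1]])]
  # Zero-based group index per row; assumes the rows of each group are contiguous
  n <- rep(seq_along(sp), lengths(sp)) - 1
  # "%.0f" avoids scientific notation (e.g. "1e+06") for round 7-digit IDs
  sprintf("%.0f", start + n)
}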

Data$ID <- new_id(Data)

Upvotes: 1

GKi

Reputation: 39707

You can use factor to create a unique ID. The levels are sorted, so the IDs follow the alphabetical order of the districts.

Data$ID <- unclass(factor(Data$DISTRICT)) + 1000
#     DISTRICT   ID
#1   KARIMGANJ 1003
#2   KARIMGANJ 1003
#3   KARIMGANJ 1003
#4   KARIMGANJ 1003
#5   KARIMGANJ 1003
#6  HAILAKANDI 1002
#7  HAILAKANDI 1002
#8  HAILAKANDI 1002
#9      CACHAR 1001
#10     CACHAR 1001
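
If the factor approach is preferred but the IDs should follow the order in which the districts first appear, one option is to pass the levels explicitly. This is only a sketch: the shortened Data and the offset 1110 are assumptions chosen to reproduce IDs that start at 1111.

```r
# Shortened example data (assumption, not the full data set)
Data <- data.frame(DISTRICT = c("KARIMGANJ", "KARIMGANJ", "HAILAKANDI", "CACHAR"))

# Fix the level order to first appearance instead of the default sorted order
f <- factor(Data$DISTRICT, levels = unique(Data$DISTRICT))

# Integer codes 1, 2, 3, ... shifted so that the first district gets 1111
Data$ID <- as.integer(f) + 1110L
```

Using as.integer() instead of unclass() also drops the levels attribute, so the column is a plain integer vector.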

Or, to number the districts in order of first appearance so that the first one gets 1, use match and unique.

Data$ID <- match(Data$DISTRICT, unique(Data$DISTRICT)) + 1000
#     DISTRICT   ID
#1   KARIMGANJ 1001
#2   KARIMGANJ 1001
#3   KARIMGANJ 1001
#4   KARIMGANJ 1001
#5   KARIMGANJ 1001
#6  HAILAKANDI 1002
#7  HAILAKANDI 1002
#8  HAILAKANDI 1002
#9      CACHAR 1003
#10     CACHAR 1003
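
Since the question mentions IDs of up to 7 digits, the match and unique idea can be wrapped in a small helper with a configurable starting value. This is a minimal sketch: the name make_group_id, its start argument, and the shortened Data are hypothetical and not part of the original question.

```r
# Hypothetical helper: one integer ID per distinct value,
# numbered in order of first appearance, beginning at `start`
make_group_id <- function(x, start = 1001L) {
  as.integer(start) + match(x, unique(x)) - 1L
}

# Shortened example data (assumption); KARIMGANJ reappears in the last row
Data <- data.frame(
  DISTRICT = c("KARIMGANJ", "KARIMGANJ", "HAILAKANDI", "CACHAR", "KARIMGANJ")
)

# A 7-digit starting value
Data$ID <- make_group_id(Data$DISTRICT, start = 1000001L)
```

Because match looks each value up in the vector of unique values, a district keeps the same ID even when its rows are not contiguous. Keeping the IDs as integers also avoids scientific notation when they are printed or converted to text.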

Upvotes: 2
