Create unique id for each binary section

Question

I have made a binary column "y/n" represented by 1 and 0 (yes and no, respectively). I now want to give each section marked 1 a unique id based on the name of the file and the section's position in the column. Below is an example of what I would like it to look like. I have no preference for what 0 should be, as long as the sections marked 1 each have a unique id.

> y/n    id
> 1      catid_a
> 1      catid_a
> 1      catid_a
> 0      no_id
> 1      catid_b
> 1      catid_b
> 0      no_id

Usually, to assign an id, I use data$id <- as.factor(substr(basename(files[i]),1,13)), but that doesn't work here: it gives a single id for the whole column, and I want multiple ids within one column. Does anyone have any ideas?

Thanks! Grace

akrun · Accepted Answer

We can use rle (run-length encoding) to label each run of consecutive identical values:

df1$id <- inverse.rle(within.list(rle(df1$`y/n`), {
  val1 <- values
  # label the runs of 1s in order (catid_a, catid_b, ...); runs of 0s get "no_id"
  val1[values != 0] <- paste0("catid_", letters[seq_along(values[values != 0])])
  val1[values == 0] <- "no_id"
  values <- val1
}))
df1$id
#[1] "catid_a" "catid_a" "catid_a" "no_id"   "catid_b" "catid_b" "no_id"
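
To see why this works, it may help to look at what rle returns for the example column. The snippet below is only an illustrative sketch: the vector yn and the hard-coded labels are assumptions made for the example, not part of the answer above.

```r
# Same values as the example column in the question
yn <- c(1, 1, 1, 0, 1, 1, 0)

# rle() splits the vector into runs of identical consecutive values
r <- rle(yn)
r$lengths  # 3 1 2 1
r$values   # 1 0 1 0

# Swap each run's value for a label (hard-coded here for illustration),
# then expand the runs back to the original length
r$values <- c("catid_a", "no_id", "catid_b", "no_id")
id <- inverse.rle(r)
id
```

Each run gets exactly one label, so every row inside a run shares the same id.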

Another option is rleid from data.table:

library(data.table)
setDT(df1)[, grp := rleid(`y/n`)
  ][`y/n` == 0, id := "no_id", by = grp
  ][is.na(id), id := paste0("catid_", letters[.GRP]), by = grp
  ][, grp := NULL][]
#   y/n      id
#1:   1 catid_a
#2:   1 catid_a
#3:   1 catid_a
#4:   0   no_id
#5:   1 catid_b
#6:   1 catid_b
#7:   0   no_id

data

df1 <- structure(list(`y/n` = c(1, 1, 1, 0, 1, 1, 0)), .Names = "y/n", row.names = c(NA, 
 -7L), class = "data.frame")
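
The question also asks for ids based on the file name, and letters only has 26 entries, so a column with more than 26 runs of 1s would run out of suffixes. A possible base R sketch that handles both is shown below. The helper label_runs and the path "data/cat_tracking_01.csv" are hypothetical names introduced for illustration, and the sketch assumes the column contains only 0 and 1 (any other value, including NA, is treated like 0).

```r
# Hypothetical helper: label each run of 1s as <prefix>_<run number>
label_runs <- function(x, prefix) {
  r <- rle(x)
  is_one <- r$values %in% 1                      # FALSE for 0 and for NA
  run_labels <- rep("no_id", length(r$values))
  run_labels[is_one] <- paste0(prefix, "_", seq_len(sum(is_one)))
  rep(run_labels, r$lengths)                     # expand back to row level
}

# Hypothetical file path; the prefix is the file name without its extension
file_path <- "data/cat_tracking_01.csv"
prefix <- tools::file_path_sans_ext(basename(file_path))

df <- data.frame(`y/n` = c(1, 1, 1, 0, 1, 1, 0), check.names = FALSE)
df$id <- as.factor(label_runs(df$`y/n`, prefix))
df
```

Inside a loop over files, the same call could be made with files[i] in place of file_path, so that ids from different files do not collide.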
