Prevent duplicates in R

Question

I have a column in a data.table whose entries are in non-decreasing order, but it can contain duplicates.

library(data.table)
labels <- c(123,123,124,125,126,126,128)
time <- data.table(labels,unique_labels="")
time
  labels unique_labels
1:    123              
2:    123              
3:    124              
4:    125              
5:    126              
6:    126              
7:    128

I want to make all entries unique, so the output should be:

time
   labels unique_labels
1:    123           123
2:    123           124
3:    124           125
4:    125           126
5:    126           127
6:    126           128
7:    128           130

Here is a loop implementation of this:

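prev_label <- 0
unique_counter <- 0
# use the full column names: time$label only works through partial matching,
# and assigning to time$unique_label would create a new, third column
for (i in seq_along(time$labels)){
    if (time$labels[i]!=prev_label)
        prev_label <- time$labels[i]
    else
        unique_counter <- unique_counter + 1
    # unique_labels was created as character (""), so values are stored as strings
    time$unique_labels[i] <- time$labels[i] + unique_counter
}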

Rui Barradas · Accepted Answer

There's a vectorized solution that avoids for loops completely. Since time is an R function, I've changed the name of your data.table to tm.

cumsum(duplicated(tm$labels)) + tm$labels
[1] 123 124 125 126 127 128 130

tm$unique_labels <- cumsum(duplicated(tm$labels)) + tm$labels
tm
   labels unique_labels
1:    123           123
2:    123           124
3:    124           125
4:    125           126
5:    126           127
6:    126           128
7:    128           130
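
To see why the one-liner works, it can help to split it into its intermediate steps. The following is only an illustrative sketch in base R; the names is_repeat and offset are made up for this example and are not part of the answer above. Note that duplicated() flags any value that appeared earlier in the vector, which matches "same as the previous entry" here only because the labels are in non-decreasing order.

```r
# Sketch: break cumsum(duplicated(x)) + x into its steps (base R only).
labels <- c(123, 123, 124, 125, 126, 126, 128)

# TRUE wherever a value has already appeared earlier in the vector
is_repeat <- duplicated(labels)

# Running count of repeats seen so far: the shift to apply at each position
offset <- cumsum(is_repeat)

# Each label is pushed up by the number of repeats at or before it
unique_labels <- labels + offset
print(unique_labels)
```

Every repeat pushes itself and all later entries up by one, which is the same thing unique_counter does in the loop from the question.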
