Reputation: 65
I have a column in a data table which has entries in non-decreasing order. But there can be duplicate entries.
library(data.table)
labels <- c(123,123,124,125,126,126,128)
time <- data.table(labels,unique_labels="")
time
labels unique_labels
1: 123
2: 123
3: 124
4: 125
5: 126
6: 126
7: 128
I want to make all entries unique, so the output will be
time
labels unique_labels
1: 123 123
2: 123 124
3: 124 125
4: 125 126
5: 126 127
6: 126 128
7: 128 130
The following loop implements this:
prev_label <- 0
unique_counter <- 0
for (i in seq_along(time$labels)) {
  if (time$labels[i] != prev_label) {
    prev_label <- time$labels[i]          # new value: remember it
  } else {
    unique_counter <- unique_counter + 1  # repeated value: increase the offset
  }
  time$unique_labels[i] <- time$labels[i] + unique_counter
}
Upvotes: 1
Views: 87
Reputation: 76683
There's a vectorized solution that avoids the for loop entirely. Since time is an R function, I've changed the name of your data.table to tm.
cumsum(duplicated(tm$labels)) + tm$labels
[1] 123 124 125 126 127 128 130
tm$unique_labels <- cumsum(duplicated(tm$labels)) + tm$labels
tm
labels unique_labels
1: 123 123
2: 123 124
3: 124 125
4: 125 126
5: 126 127
6: 126 128
7: 128 130
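To see why the one-liner above works, here is a minimal base R sketch that splits it into steps. The intermediate names is_repeat and offset are only illustrative, and the sketch assumes, as the question states, that the labels are already in non-decreasing order.

```r
labels <- c(123, 123, 124, 125, 126, 126, 128)

# TRUE wherever a value has already been seen earlier in the vector
is_repeat <- duplicated(labels)   # FALSE TRUE FALSE FALSE FALSE TRUE FALSE

# running count of repeats seen so far; this is the loop's unique_counter
offset <- cumsum(is_repeat)       # 0 1 1 1 1 2 2

# shift every label by the number of repeats up to and including it
unique_labels <- labels + offset  # 123 124 125 126 127 128 130

# sanity check: no value occurs twice in the result
stopifnot(!anyDuplicated(unique_labels))
```

With sorted input, each step either keeps the label and raises the offset by one, or raises the label and keeps the offset, so the result should be strictly increasing.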
Upvotes: 2
Reputation: 870
tank = paste("t", 1:NROW(labels), sep="")
time$unique_labels = ifelse(duplicated(time), tank, time$labels)
The duplicated function of the data.table package returns a logical vector that flags the duplicated rows of your dataset. Just replace the flagged entries with "random" values that you are sure are not used in your set.
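As a rough illustration of the placeholder idea, here is a base R sketch that works on the plain labels vector rather than on the data.table. The names dup and placeholder are hypothetical. Note that the result is a character vector, so it differs from the numeric output requested in the question.

```r
labels <- c(123, 123, 124, 125, 126, 126, 128)

# logical flags: TRUE for every value that already appeared earlier
dup <- duplicated(labels)

# one candidate placeholder per position: "t1", "t2", ...
placeholder <- paste0("t", seq_along(labels))

# keep the original label unless it is a repeat, then use the placeholder;
# mixing text and numbers makes ifelse() return a character vector
unique_labels <- ifelse(dup, placeholder, labels)
unique_labels
```

This relies on no real label ever looking like "t2" or "t6", which holds here because the labels are numeric.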
Upvotes: 1