Reputation: 218
I want to label samples according to whether each value differs from the previous sample's value by more than 500. I've seen examples of conditional labelling, but I can't find one that fits what I need.
For example, my data looks like this:
column a
200
230
510
1200
1800
1700
2400
I want consecutive samples to share a label as long as each one is within 500 of the sample before it. So the output would be:
column a column b
200 region1
230 region1
510 region1
1200 region2 # new region starts because the difference from 510 is more than 500 (690)
1400 region2
1700 region2
2400 region3 # new region starts because the difference from 1700 is 700
I've seen examples of conditional labelling, but all of them use a fixed set of labels (for example, just binary labels), and I need the label number (region number) to increase with each new region. How can I do this? I have tried adapting other examples, but I have made little progress either in setting the "new label when the difference is more than 500" condition or in getting sequential labelling.
Upvotes: 3
Views: 146
Reputation: 39707
You can use diff and cumsum:
# diff() gives each value's change from the previous row; cumsum() counts the changes above 500
x$b <- paste0("region", c(1, 1 + cumsum(diff(x$a) > 500)))
x
# a b
#1 200 region1
#2 230 region1
#3 510 region1
#4 1200 region2
#5 1800 region3
#6 1700 region3
#7 2400 region4
Data
x <- data.frame(a=c(200,230,510,1200,1800,1700,2400))
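To see why this works, the one-liner can be unpacked into its intermediate steps. This is only a sketch on the same sample data; the names d, threshold, new_region and region_id are made up here for illustration and are not part of the answer's code.

```r
# Same sample values as in the Data section.
a <- c(200, 230, 510, 1200, 1800, 1700, 2400)

# Change from each value to the next (one element shorter than a).
d <- diff(a)                      # 30 280 690 600 -100 700

# Assumed cut-off, taken from the question.
threshold <- 500

# TRUE wherever the increase exceeds the threshold, i.e. a new region starts.
new_region <- d > threshold       # FALSE FALSE TRUE TRUE FALSE TRUE

# cumsum() counts the breaks seen so far; the first row always opens region 1.
region_id <- c(1, 1 + cumsum(new_region))

b <- paste0("region", region_id)
print(data.frame(a = a, b = b))
```

Note that d > threshold only flags increases. If a drop of more than 500 should also start a new region, abs(d) > threshold would probably be the condition to use instead; on this sample data both give the same labels.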
Upvotes: 5