user3698046

Reputation: 119

R programming: more flexible version of this for loop

Below is my R code, which takes a vector a and returns a vector b. Vector b is supposed to hold a unique identifier for each element of a, in the format value-counter, where the counter starts at 0 for each run of identical values. Note that a is sorted, so identical numbers are next to each other.

a <- c(1, 1, 1, 2, 2, 2, 3, 4, 5, 6, 6, 6, 6, 7, 8, 9, 9)
b <- NULL


for(i in 5:length(a)){
        if (a[i] == a[i - 1] & a[i] == a[i - 2] & a[i] == a[i - 3] & a[i] == a[i - 4])
            b[i] <- paste(a[i], "-", 4, sep="")
        else if (a[i] == a[i - 1] & a[i] == a[i - 2] & a[i] == a[i - 3])
            b[i] <- paste(a[i], "-", 3, sep="")
        else if (a[i] == a[i - 1] & a[i] == a[i - 2])
            b[i] <- paste(a[i], "-", 2, sep="")
        else if (a[i] == a[i - 1])
            b[i] <- paste(a[i], "-", 1, sep="")
        else 
            b[i] <- paste(a[i], "-", 0, sep="")
}

# The first 4 values in vector b have to be entered manually
# because the for loop looks back up to 4 positions in a
b[1] <- "1-0" 
b[2] <- "1-1"
b[3] <- "1-2"
b[4] <- "2-0"

b

The above code returns b as needed. However, if vector a has a longer run of identical numbers than the loop checks for, b ends up containing duplicate elements. How can this for loop be improved so that a run of any length gets the appropriate unique identifiers?

I'm thinking of using some sort of nested for loop, but how can this be done inside an if statement?

Upvotes: 3

Views: 94

Answers (3)

Rick

Reputation: 898

# This works only if a really is sorted, because tapply() returns
# the groups in sorted order and unlist() concatenates them in that order:
b <- tapply(seq_along(a), a, FUN = function(x) seq_along(x) - 1)
b <- paste(a, unlist(b), sep = "-")
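To see why the sortedness caveat matters, here is a small sketch on a made-up unsorted vector (x is an illustrative input, not from the question). The counters come back in group order rather than in the original order, so they get attached to the wrong elements:

```r
# Hypothetical unsorted input, used only to illustrate the caveat
x <- c(2, 2, 1)

# Same approach as above: one zero-based counter per group
counters <- tapply(seq_along(x), x, FUN = function(i) seq_along(i) - 1)

# unlist() returns the counters for group 1 first, then group 2,
# which no longer lines up with the order of x
bad <- paste(x, unlist(counters), sep = "-")
bad
```

With sorted input, as in the question, the group order and the element order coincide, so this problem does not arise.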

Upvotes: 1

thelatemail

Reputation: 93938

Using ave and paste, which I now realise is essentially just a variation on @RichardScriven's answer:

paste(a, ave(a,a,FUN=seq_along) - 1, sep="-")
# [1] "1-0" "1-1" "1-2" "2-0" "2-1" "2-2" "3-0" "4-0" "5-0" "6-0" "6-1"
#[12] "6-2" "6-3" "7-0" "8-0" "9-0" "9-1"
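One difference that might matter if a is ever not sorted: ave() groups by value, while an rle()-based counter groups by run. The sketch below compares the two on a made-up vector x (an assumption for illustration, not data from the question):

```r
# Hypothetical input where the value 1 appears in two separate runs
x <- c(1, 1, 2, 2, 1, 1)

# Counter per distinct value: keeps counting across separate runs
by_value <- paste(x, ave(x, x, FUN = seq_along) - 1, sep = "-")

# Counter per run of identical values: restarts at every new run
by_run <- paste(x, sequence(rle(x)$lengths) - 1, sep = "-")

by_value
by_run
```

Here by_value stays unique, whereas by_run repeats "1-0" and "1-1". For sorted input such as the a in the question, both give the same result.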

Upvotes: 3

Rich Scriven

Reputation: 99361

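This could probably replace your current loop. rle() gives the length of each run of identical values in a, and sequence() turns those lengths into a counter for each run, which we shift so that it starts from zero. Then we can paste() the values and counters together with a "-" separator.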

paste(a, sequence(rle(a)$lengths) - 1, sep = "-")
#  [1] "1-0" "1-1" "1-2" "2-0" "2-1" "2-2" "3-0" "4-0" "5-0" "6-0" "6-1"
# [12] "6-2" "6-3" "7-0" "8-0" "9-0" "9-1"

This is identical to the b produced by your code.
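If it helps to see the intermediate steps, the one-liner can be unpacked as follows. The names runs and counter are only illustrative:

```r
a <- c(1, 1, 1, 2, 2, 2, 3, 4, 5, 6, 6, 6, 6, 7, 8, 9, 9)

# Run-length encoding: one entry per run of identical values
runs <- rle(a)
runs$values   # the value of each run
runs$lengths  # how many times each value repeats in a row

# sequence() expands each length n into 1, 2, ..., n;
# subtracting 1 makes every run start counting at zero
counter <- sequence(runs$lengths) - 1

b <- paste(a, counter, sep = "-")
b
```

Because the counter is built from the run lengths, it works for runs of any length, with no need to fill in the first few values by hand.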

Upvotes: 5
