Igor Henrique

Reputation: 13

How to loop through multiple values in the same string?

I want to loop through a sequence of letters, 'ABCDEFGHIJK', but a for loop in R processes one value at a time. Is there a way to loop over 3 characters at a time? In this case, the sequence 'ABCDEFGHIJK' would be processed as 'ABC', then 'DEF', and so on.

I've tried changing the length used in the loop, but I haven't found a way. I can do this in Python, but I couldn't find any information about doing it in R, including in R's built-in help.

xp <-'ACTGCT'
for(i in 1:length(xp)){
  if(i == 'ACG'){
    print('T')
  }
}

Upvotes: 1

Views: 140

Answers (4)

G. Grothendieck

Reputation: 269644

1) Base R. Iterate over the sequence 1, 4, 7, ... and use substr to extract the 3-character portion of the input string starting at that position. Then perform whatever processing is desired. If the last chunk has fewer than 3 characters, substr returns whatever is available for that chunk. This approach is particularly good if you want to exit early, since a break can be inserted into the loop.

for(i in seq(1, nchar(xp), 3)) {
  s <- substr(xp, i, i+2)
  print(s) # replace with desired processing
}
## [1] "ACT"
## [1] "GCT"
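As a minimal sketch of how the loop above might be combined with the comparison from the question, the following stops at the first chunk equal to a target string. The names target and found are illustrative assumptions, not part of the original code.

```r
xp <- "ACTGCT"
target <- "GCT"   # hypothetical chunk to search for
found <- NA       # start position of the first matching chunk, if any

for (i in seq(1, nchar(xp), by = 3)) {
  s <- substr(xp, i, i + 2)   # current 3-character chunk
  if (s == target) {
    found <- i
    break                     # exit early once a match is seen
  }
}
found
## [1] 4
```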

1a) lapply/sapply. If one iteration does not depend on another, the loop can be translated to lapply or sapply.

process <- function(i) { 
  s <- substr(xp, i, i+2)
  s  # replace with desired processing
}
sapply(seq(1, nchar(xp), 3), process)
## [1] "ACT" "GCT"

2) rollapply. Another possibility is to break the string up into single characters and then iterate over those, passing a 3-element vector of single characters to the indicated function. Here we have used toString to process each chunk, but that can be replaced with any other suitable function.

library(zoo)
rollapply(strsplit(xp, "")[[1]], 3, by = 3, toString, align = "left", partial = TRUE)
## [1] "A, C, T" "G, C, T"
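If zoo is not available, a similar grouping can be sketched in base R by labelling each single character with a chunk index and pasting each group back together. The names chars, grp and chunks are introduced here for illustration only.

```r
xp <- "ACTGCT"
chars <- strsplit(xp, "")[[1]]          # single characters
grp <- ceiling(seq_along(chars) / 3)    # chunk index: 1 1 1 2 2 2
# collapse each group of up to 3 characters into one string
chunks <- unname(vapply(split(chars, grp), paste, character(1), collapse = ""))
chunks
## [1] "ACT" "GCT"
```

A trailing group with fewer than 3 characters is kept as a shorter chunk, which roughly mirrors partial = TRUE above.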

Upvotes: 0

Andrew

Reputation: 5138

Here is a stringr solution that returns a named list indicating whether each search string matches one of the 3-character chunks:

library(stringr)

# Split string into sequences of 3 (or fewer if length is not multiple of 3)
split_strings <- str_extract_all("ABCDEFGHIJK", ".{1,3}", simplify = TRUE)[1, ]

# The strings you want to loop through / search for
x <- c("ABC", "DEF", "GHI", "LMN")

# Output is a named list
sapply(x, `%in%`, split_strings, simplify = FALSE)

$ABC
[1] TRUE

$DEF
[1] TRUE

$GHI
[1] TRUE

$LMN
[1] FALSE

Or, if you only want to look for one element:

"ABC" %in% split_strings
[1] TRUE
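
The same split can also be sketched without stringr, using base R's regmatches and gregexpr with the same pattern. The name split_base is an assumption introduced for this example.

```r
x <- "ABCDEFGHIJK"
# Extract consecutive runs of up to 3 characters
split_base <- regmatches(x, gregexpr(".{1,3}", x))[[1]]
split_base
# [1] "ABC" "DEF" "GHI" "JK"

"ABC" %in% split_base
# [1] TRUE
```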

Upvotes: 2

akrun

Reputation: 887128

An option would be to split the string after every 3 characters and then do the comparison on each piece:

lapply(strsplit(v1, "(?<=.{3})", perl = TRUE), function(x) x== 'ACG')
#[[1]]
#[1] FALSE FALSE FALSE FALSE

data

v1 <- 'ABCDEFGHIJK'

Upvotes: 2

Sotos

Reputation: 51592

We can use the vectorized substring, i.e.

substring('ABCDEFGHIJK', seq(1, nchar('ABCDEFGHIJK') - 1, 3), seq(3, nchar('ABCDEFGHIJK'), 3)) == 'ACG'
#[1] FALSE FALSE FALSE FALSE

NOTE: This only extracts complete 3-character chunks. If fewer than 3 characters are left over at the end, they are not returned. For the above example, the leftover 'JK' comes back as an empty string:

substring('ABCDEFGHIJK', seq(1, nchar('ABCDEFGHIJK') - 1, 3), seq(3, nchar('ABCDEFGHIJK'), 3))
#[1] "ABC" "DEF" "GHI" ""
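
If the trailing characters should be kept, one possible variant (a sketch, not part of the original answer) computes each end position from its start position and caps it at the string length. The names x, starts and chunks are illustrative.

```r
x <- "ABCDEFGHIJK"
starts <- seq(1, nchar(x), by = 3)            # 1, 4, 7, 10
ends <- pmin(starts + 2, nchar(x))            # 3, 6, 9, 11
chunks <- substring(x, starts, ends)
chunks
#[1] "ABC" "DEF" "GHI" "JK"
```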

Upvotes: 2
