jeanlain

Reputation: 418

replace separate parts of a string in R

substring() <- value or substr() <- value can only replace a single contiguous range of characters per string. I wonder what the best solution is if I want to replace several disjoint characters in a string. My current solution looks like this:

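string <- "a string"
# Split into single characters; fixed = TRUE is spelled out instead of the positional T
splitted <- strsplit(string, "", fixed = TRUE)[[1]]
splitted[c(1, 5, 8)] <- c("that", "", "ks")
paste(splitted, collapse = "")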
[1] "that stinks"

Of course, this is an arbitrary example. I actually want to replace nucleotides in genes at hundreds of separate positions. Note that single characters (bases) would always be replaced by single characters, unlike in my example here.

Alternatively, I could call substr() <- value repeatedly in a loop (I don't think a loop can be avoided with substr(), since each call has to operate on the result of the previous one), but that would probably be slower.
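
A minimal sketch of that loop variant, assuming every replacement is exactly one character so the string length never changes. The name replace_chars_loop is hypothetical, not an existing function, and no claim is made here about its speed relative to the split-and-paste approach.

```r
# Hypothetical helper: replace single characters one position at a time
# using the replacement form of substr().
replace_chars_loop <- function(string, positions, replacement) {
  stopifnot(length(positions) == length(replacement))
  for (i in seq_along(positions)) {
    # Each replacement is one character wide, so later positions stay valid.
    substr(string, positions[i], positions[i]) <- replacement[i]
  }
  string
}

result <- replace_chars_loop("ACCTTTAAGAGATTTAGGGAGA", c(2, 5, 7), c("G", "C", "C"))
print(result)
```

Because substr() <- value never changes the length of the string, a replacement longer than one character would be truncated rather than inserted.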

Thanks for the suggestions.

EDIT: My example was misleading. Here is my test function:

replaceCharsInString <- function(string, positions, replacement) {
    splitted <- strsplit(string, "", fixed = TRUE)[[1]]
    splitted[positions] <- replacement   # 'replacement' is a character vector
    paste(splitted, collapse = "")
}

> replaceCharsInString("ACCTTTAAGAGATTTAGGGAGA", c(2,5,7), c("G","C","C"))
[1] "AGCTCTCAGAGATTTAGGGAGA"

Upvotes: 2

Views: 343

Answers (2)

Paul James

Reputation: 530

I don't really understand what you are looking for exactly since you even say your example doesn't represent what you are actually doing.

This may be possible using parentheses (), which create capturing groups:

gsub("(.*)(this)(.*)", '\\1him\\3', 'get this off my desk')
[1] "get him off my desk"

The parentheses create groups. R can then reference the captured groups by number in the replacement string using double-backslash notation: \\1, \\2, etc. Here I have 3 groups:

  1. get
  2. this
  3. off my desk

In my code, I am replacing this (group 2) with him.
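
The same idea can be pointed at a position rather than a word. This is only a sketch under the assumption that a single character is replaced; replace_at is a hypothetical helper name, and perl = TRUE is an illustrative choice of regex engine rather than a requirement stated in this thread.

```r
# Hypothetical helper: replace the single character at position `pos`.
replace_at <- function(string, pos, char) {
  # Group 1 captures the first (pos - 1) characters, the bare "." matches the
  # character being replaced, and group 2 captures everything after it.
  pattern <- sprintf("^(.{%d}).(.*)$", as.integer(pos - 1))
  sub(pattern, paste0("\\1", char, "\\2"), string, perl = TRUE)
}

result <- replace_at("ACCTTTAAGAGATTTAGGGAGA", 5, "C")
print(result)
```

Each call handles one position, so replacing hundreds of positions this way would still mean one call per position.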

Upvotes: 3

Andrew Taylor

Reputation: 3488

After finishing this, maybe my way is more complicated, but here we go:

f <- function(x, strings, replaces) {
  # Fail early instead of printing a message and returning x unchanged
  if (length(strings) != length(replaces)) {
    stop("strings should have the same number of elements as replaces")
  }
  # Apply each substitution to the result of the previous one
  for (i in seq_along(strings)) {
    x <- gsub(strings[i], replaces[i], x)
  }
  x
}


string <- "a string"
strings <- c("a", "r", "ng")
replaces <- c("that", "", "nks")


f(string, strings, replaces)


[1] "that stinks"
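
One thing to keep in mind with a pattern-based approach like this: gsub() replaces every match of a pattern, whereas the question is about specific positions. A small sketch of the difference, using the sequence from the question (the variable names are only for illustration):

```r
dna <- "ACCTTTAAGAGATTTAGGGAGA"

# Pattern-based: every "T" is replaced, wherever it occurs.
by_pattern <- gsub("T", "C", dna, fixed = TRUE)

# Position-based: only the character at position 5 is replaced.
by_position <- dna
substr(by_position, 5, 5) <- "C"

print(by_pattern)
print(by_position)
```

For nucleotide data, where the same base occurs many times, the two give different results unless each pattern happens to match exactly once.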

Upvotes: 2
