Esben Eickhardt

Reputation: 3852

R: Remove substring within string

Is there an elegant way to remove a sub-string within a string based on the index of the characters?

Here is how I do it now:

# My data
mystring <- "Hello, how are {you} doing?"
index_of_substring <- c(16,20)

# Pasting two substrings
mystring_no_substring <- paste0(substr(mystring, 1, index_of_substring[1]-1), substr(mystring, index_of_substring[2]+1, nchar(mystring)))

# Cleaning extra spaces
mystring_no_substring <- gsub("  ", " ", mystring_no_substring)

I could of course wrap this in a general function, but I was wondering whether there is a more elegant solution out there, e.g. a way to replace the characters at a given index range in a string with nothing or with another word.

Note: This is not a regex question.

Upvotes: 2

Views: 3847

Answers (3)

M.Bergen

Reputation: 174

I believe my solution is pretty much what you'd get if you coded your method as a general function, but here you go. I first use a custom function called "strpos_fixed" to find the position of the substring I'd like to remove. I am not quite as comfortable with regex as I'd like to be, so I restrict this function to fixed matching for simplicity's sake.

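# Position of the first fixed (non-regex) match of y in x; -1 if there is none
strpos_fixed <- function(x, y) {
  regexpr(y, x, fixed = TRUE)[1]
}

# Remove everything from the first rm_start through the first rm_end
rm_substr <- function(string, rm_start, rm_end) {
  # Text before the opening delimiter
  sub1 <- substr(string, 1, strpos_fixed(string, rm_start) - 1)
  # Text after the closing delimiter
  sub2 <- substr(string, strpos_fixed(string, rm_end) + nchar(rm_end),
                 nchar(string))
  # paste0 (not paste) so no space is inserted where the input had none
  gsub("\\s{2,}", " ", paste0(sub1, sub2))
}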

mystring <- "Hello, how are {you} doing?"
rm_substr(mystring, "{", "}")

Upvotes: 0

G. Grothendieck

Reputation: 269644

1) strsplit/paste. Break up the input into characters, omit the ones between 16 and 20 inclusive, collapse it back together and replace runs of spaces with single spaces. Uses base functions only.

gsub(" +", " ", paste(strsplit(s, "")[[1]][-seq(ix[1], ix[2])], collapse = ""))
## [1] "Hello, how are doing?"

2) substr<-. Replace the indicated characters with spaces and then reduce runs of spaces to a single space. Only base functions are used.

gsub(" +", " ", "substr<-"(s, ix[1], ix[2], gsub(".", " ", s)))
## [1] "Hello, how are doing?"

Note that this is non-destructive, i.e. it outputs the result without modifying the input.

Note: We used test input:

s <- "Hello, how are {you} doing?"
ix <- c(16, 20)
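
To make approach (2) easier to follow, here is the same one-liner unrolled into steps. This is only an illustrative sketch; the intermediate names blanks, blanked and result are made up for the example and are not part of the answer above.

```r
s <- "Hello, how are {you} doing?"
ix <- c(16, 20)

# Step 1: build a string of spaces with the same length as s
blanks <- gsub(".", " ", s)

# Step 2: call the replacement function `substr<-` in its functional form.
# It returns a modified copy; s itself is left untouched.
blanked <- `substr<-`(s, ix[1], ix[2], blanks)

# Step 3: collapse runs of spaces into a single space
result <- gsub(" +", " ", blanked)
result
## [1] "Hello, how are doing?"
```

Because `substr<-` overwrites characters in place and never changes the length of the string, the removed range has to be blanked out first and the surplus spaces squeezed afterwards.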

Upvotes: 2

sm925

Reputation: 2678

You can use paste0 and substr like this too:

paste0(substr(mystring, 1, 14), substr(mystring, 21, 27))
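
If you want the same idea without hard-coded positions, it could be wrapped up roughly as follows. This is a minimal sketch using base R only; replace_by_index is a hypothetical helper name invented here, and it assumes a single string and a valid 1-based, inclusive index range.

```r
# Hypothetical helper: replace characters start..end of x with `replacement`
# (an empty replacement simply removes the range)
replace_by_index <- function(x, start, end, replacement = "") {
  stopifnot(start >= 1, end >= start, end <= nchar(x))
  paste0(substr(x, 1, start - 1), replacement, substr(x, end + 1, nchar(x)))
}

mystring <- "Hello, how are {you} doing?"

# Remove the range, then squeeze the leftover double space
removed <- gsub(" +", " ", replace_by_index(mystring, 16, 20))
removed
## [1] "Hello, how are doing?"

# Substitute another word for the range
swapped <- replace_by_index(mystring, 16, 20, "they")
swapped
## [1] "Hello, how are they doing?"
```

Unlike `substr<-`, this allows the replacement to be shorter or longer than the range it replaces, since the string is rebuilt from the two surrounding pieces.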

Upvotes: 1
