ThallysHoelz

Reputation: 13

Regex: Replacing all spaces between two characters

Consider the following string: "This is an example: this is another one, and this is yet another, and other, and so on." I want to replace all space characters between the : and the first , that follows it with underscores, so that the result looks like this: "This is an example:_this_is_another_one, and this is yet another, and other, and so on."

What I've tried so far:

Upvotes: 1

Views: 3833

Answers (2)

Wiktor Stribiżew

Reputation: 626738

Update: there is a simple way to replace anything between two arbitrary strings in R: call stringr::str_replace_all with an anonymous function as the replacement argument:

Generic stringr approach

library(stringr)

# left - left boundary
# right - right boundary
# x - input
# what - regex pattern to search for inside matches
# repl - replacement text for the in-pattern matches
ReplacePatternBetweenTwoStrings <- function(left, right, x, what, repl) {
  left  <- gsub("([][{}()+*^${|\\\\?.])", "\\\\\\1", left)
  right <- gsub("([][{}()+*^${|\\\\?.])", "\\\\\\1", right)
  str_replace_all(x, 
     paste0("(?s)(?<=", left, ").*?(?=", right, ")"),
     function(z) gsub(what, repl, z)
  )
}

x <- "This is an example: this is another one, and this is yet another, and other, and so on."
ReplacePatternBetweenTwoStrings(":", ",", x, "\\s+", "_")
## => [1] "This is an example:_this_is_another_one, and this is yet another, and other, and so on."

See this R demo.
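If stringr is not available, the same idea can be sketched in base R with gregexpr() and the regmatches<- replacement function. This is only a sketch under some assumptions: the name replace_between is hypothetical (it is not part of the answer above), and the boundaries are treated as literal strings that do not contain \E, because they are quoted with PCRE's \Q...\E instead of being escaped by hand.

```r
# Hypothetical base-R variant of the helper above (name is illustrative only).
# left/right - literal boundary strings (assumed not to contain "\\E")
# what/repl  - pattern and replacement applied inside each match
replace_between <- function(x, left, right, what, repl) {
  # \Q...\E makes PCRE treat the boundaries as literal text
  pattern <- paste0("(?s)(?<=\\Q", left, "\\E).*?(?=\\Q", right, "\\E)")
  m <- gregexpr(pattern, x, perl = TRUE)
  # Rewrite only the matched stretches, leaving the rest of x untouched
  regmatches(x, m) <- lapply(regmatches(x, m), function(z) gsub(what, repl, z))
  x
}

x <- "This is an example: this is another one, and this is yet another, and other, and so on."
res <- replace_between(x, ":", ",", "\\s+", "_")
res
## => [1] "This is an example:_this_is_another_one, and this is yet another, and other, and so on."
```

As in the stringr version, the lazy .*? stops at the first right boundary after each left boundary.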

Replacing all whitespaces between the closest : and ,

This is a simple special case of the above: the pattern :[^:,]+, matches a :, then one or more characters other than : and , (the delimiter characters), and then a ,. Whitespace is then replaced with underscores inside the matches only:

stringr::str_replace_all(x, ":[^:,]+,", function(z) gsub("\\s+", "_", z))

See the regex demo.
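To see what "closest" means here, the pattern can be tried on an input with two colons before the comma. The sketch below uses base R (gregexpr() plus regmatches<-) rather than stringr, and the input x2 is a made-up string, not one from the question.

```r
# Made-up input: two colons precede the comma
x2 <- "a: b: c d, e"

out <- x2
# [^:,]+ cannot cross a colon, so only the colon nearest to the comma starts a match
m <- gregexpr(":[^:,]+,", out)
regmatches(out, m) <- lapply(regmatches(out, m), function(z) gsub("\\s+", "_", z))
out
## => [1] "a: b:_c_d, e"
```

The space after the first colon is left alone, because that colon is not the one closest to the comma.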

Original answer (scales rather poorly)

You may use the following regex:

(?:\G(?!^)|:)[^,]*?\K\s(?=[^,]*,)

Replace with _. See the regex demo.

Details

  • (?:\G(?!^)|:) - the end of the previous match (\G(?!^)) or a colon
  • [^,]*? - zero or more characters other than ,, as few as possible
  • \K - match reset operator discarding the text matched so far
  • \s - a whitespace
  • (?=[^,]*,) - a positive lookahead check that makes sure there is a , after zero or more chars other than a comma.
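Two of these pieces can be tried in isolation to see what each contributes. The inputs below are made up for illustration, and the variable names k_demo and la_demo are arbitrary.

```r
# \K: "key=" must be present, but it is dropped from the match,
# so only the value after it is replaced.
k_demo <- gsub("key=\\K\\w+", "***", "key=secret other=plain", perl = TRUE)
k_demo
## => [1] "key=*** other=plain"

# Lookahead only: every space that still has a comma ahead of it is replaced,
# including the one before the colon. The (?:\G(?!^)|:) part of the full
# pattern is what restricts matches to the text after a colon.
la_demo <- gsub("\\s(?=[^,]*,)", "_", "a b: c d, e f", perl = TRUE)
la_demo
## => [1] "a_b:_c_d, e f"
```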

R demo:

re <- "(?:\\G(?!^)|:)[^,]*?\\K\\s(?=[^,]*,)"
x <- "This is an example: this is another one, and this is yet another, and other, and so on."
gsub(re, "_", x, perl=TRUE)
# => [1] "This is an example:_this_is_another_one, and this is yet another, and other, and so on."

Upvotes: 4

ags29

Reputation: 2696

Here is a slightly gross answer:

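txt <- "This is an example: this is another one, and this is yet"

# Mark the text between the first ":" and the next "," with "$" (assumes txt contains no "$"),
# then split on that marker. The lazy quantifiers stop the middle group at the first comma
# instead of the last one, which matters when the string contains several commas.
split_str <- unlist(strsplit(gsub("^(.*?:)(.*?)(,.*)", "\\1$\\2$\\3", txt, perl = TRUE), split = "$", fixed = TRUE))

paste0(split_str[1], gsub(" ", "_", split_str[2]), split_str[3])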

Upvotes: 0
