Mahmoud
Mahmoud

Reputation: 401

Replace Nth occurrence of a character in a string with something else

Consider a = paste(1:10,collapse=", ") which results in

a = "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"

I would like to replace every n-th (say 4-th) occurrences of "," and replace it with something else (say "\n"). The desired output would be:

"1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"

I am looking for a code that uses gsub (or something equivalent) and some form of regular expression to achieve this goal.

Upvotes: 8

Views: 8033

Answers (5)

Dorian Grv
Dorian Grv

Reputation: 499

This one can be replace with a string instead of a character. I did a function that you can use easily :)

A demo here to understand the regex

> a = paste(1:10,collapse=", ")
> a
[1] "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"
> # if you want the 2nd occurence
> gsub("(.*?,.*?),(.*)", "\\1\n\\2", a)
[1] "1, 2\n 3, 4, 5, 6, 7, 8, 9, 10"
> # if you want the 3rd occurence
> gsub("(.*?,.*?,.*?),(.*)", "\\1\n\\2", a)
[1] "1, 2, 3\n 4, 5, 6, 7, 8, 9, 10"
> # if you want the 4rd occurence
> gsub("(.*?,.*?,.*?,.*?),(.*)", "\\1\n\\2", a)
[1] "1, 2, 3, 4\n 5, 6, 7, 8, 9, 10"
> # if you want the last occurence
> gsub("(.*,.*),(.*)", "\\1\n\\2", a)
[1] "1, 2, 3, 4, 5, 6, 7, 8, 9\n 10"
> 
> 
> replace.occurence <- function(x, pattern, replacement, which.occu) {
+   if( which.occu == "last" ) {
+     gsub(paste0("(.*", pattern, ".*)", pattern, "(.*)"), paste0("\\1", replacement, "\\2"), x)
+   } else {
+     gsub(paste0("(.*?", paste0(rep(paste0(pattern, ".*?"), which.occu - 1), collapse = ""), ")", pattern, "(.*)"), paste0("\\1", replacement, "\\2"), x)
+   }
+ }
> 
> replace.occurence(a, pattern = ",", replacement = "\n", which.occu = 2)
[1] "1, 2\n 3, 4, 5, 6, 7, 8, 9, 10"
> replace.occurence(a, pattern = ",", replacement = "\n", which.occu = 3)
[1] "1, 2, 3\n 4, 5, 6, 7, 8, 9, 10"
> replace.occurence(a, pattern = ",", replacement = "\n", which.occu = 4)
[1] "1, 2, 3, 4\n 5, 6, 7, 8, 9, 10"
> replace.occurence(a, pattern = ",", replacement = "\n", which.occu = "last")
[1] "1, 2, 3, 4, 5, 6, 7, 8, 9\n 10"
> 
> replace.occurence(a, pattern = ", 3, 4,", replacement = ", 4, 3,", which.occu = 1)
[1] "1, 2, 4, 3, 5, 6, 7, 8, 9, 10"

Upvotes: 0

thelatemail
thelatemail

Reputation: 93813

regmatches as yet another alternative:

a <- "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"

fn <- ","
rp <- "\n"
n <- 4

regmatches(a, gregexpr(fn, a)) <- list(c(rep(fn,n-1),rp))
a
#[1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"

As a function:

a <- "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"

replN <- function(x, fn, rp, n) {
    regmatches(x, gregexpr(fn, x)) <- list(c(rep(fn,n-1),rp))
    x
}
replN(a, ",", "\n", 4)
#[1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10

You could even extend this to be vectorised over the replacement argument:

a = "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"

replN <- function(x,fn,rp,n) {
    sel <- rep(fn, n*length(rp))
    sel[seq_along(rp)*n] <- rp
    regmatches(x, gregexpr(fn, x)) <- list(sel)
    x
}
replN(a, fn=",", rp=c("1st","2nd"), n=4)
#[1] "1, 2, 3, 41st 5, 6, 7, 82nd 9, 10"

Upvotes: 2

Pushpesh Kumar Rajwanshi
Pushpesh Kumar Rajwanshi

Reputation: 18357

You can replace ((?:\d+, ){3}\d), with \1\n

You basically capture everything till fourth comma in group1 and comma separately and replace it with \1\n which replaces matched text with group1 text and newline, giving you the intended results.

Regex Demo

R Code demo

gsub("((?:\\d+, ){3}\\d),", "\\1\n", "1, 2, 3, 4, 5, 6, 7, 8, 9, 10")

Prints,

[1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"

Edit:

To generalize above solution to any text, we can change \d to [^,]

New R code demo

gsub("((?:[^,]+, ){3}[^,]+),", "\\1\n", "1, 2, 3, 4, 5, 6, 7, 8, 9, 10")
gsub("((?:[^,]+, ){3}[^,]+),", "\\1\n", "a, bb, ccc, dddd, 500, 600, 700, 800, 900, 1000")

Output,

[1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"
[1] "a, bb, ccc, dddd\n 500, 600, 700, 800\n 900, 1000"

Upvotes: 12

Jilber Urbina
Jilber Urbina

Reputation: 61154

regex is the best alternative, nontheless here's another approach without regex

> str_vec <- strsplit(a, " ")[[1]] 
> where <- seq_along(str_vec) %% 4 == 0
> str_vec[where] <- sub(",", "\n", str_vec[where])
> paste(str_vec, collapse=" ")
[1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"

Upvotes: 1

D.sen
D.sen

Reputation: 922

Using both regex and gsub.

a = paste(1:10,collapse=", ")
x <- gsub("([^,]*,[^,]*,[^,]*,[^,]*),", '\\1\n', a)
x
#> [1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"

Upvotes: 1

Related Questions