cory

Reputation: 6659

replace every other space with new line

I have strings like this:

a <- "this string has an even number of words"
b <- "this string doesn't have an even number of words"

I want to replace every other space with a new line. So the output would look like this...

myfunc(a)
# "this string\nhas an\neven number\nof words"
myfunc(b)
# "this string\ndoesn't have\nan even\nnumber of\nwords"

I've accomplished this by splitting the string with strsplit, pasting a newline onto the even-numbered words, and then joining them back into one string with paste(a, collapse=" "). Is there a regular expression I can use with gsub to accomplish this?
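For reference, the split-and-paste approach described above might look roughly like this. It is only a sketch of one possible reading, not the original code: the name split_approach is hypothetical, and it builds the separators explicitly so that no stray space ends up next to a newline.

```r
# Hypothetical helper: works on a single string, splitting on single spaces
split_approach <- function(x) {
  words <- strsplit(x, " ", fixed = TRUE)[[1]]
  n <- length(words)
  # Separator that follows each word: a space after odd positions,
  # a newline after even positions
  seps <- ifelse(seq_len(n) %% 2 == 0, "\n", " ")
  seps[n] <- ""  # nothing after the last word
  paste0(words, seps, collapse = "")
}

a <- "this string has an even number of words"
b <- "this string doesn't have an even number of words"
split_approach(a)
split_approach(b)
```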

Upvotes: 3

Views: 2520

Answers (1)

Frank

Reputation: 66819

@Jota suggested a simple and concise way (myfunc below); myfunc2 is my own, more literal variant:

myfunc  = function(x) gsub("( \\S+) ", "\\1\n", x)       # Jota's    
myfunc2 = function(x) gsub("([^ ]+ [^ ]+) ", "\\1\n", x) # my idea

lapply(list(a,b), myfunc)


[[1]]
[1] "this string\nhas an\neven number\nof words"

[[2]]
[1] "this string\ndoesn't have\nan even\nnumber of\nwords"

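How it works. The idea of the "([^ ]+ [^ ]+) " regex is (1) "find two runs of non-space characters (words) with a space between them and a space after them" and (2) "replace that trailing space with a newline". The parentheses capture the two words, so \\1 puts them back unchanged in the replacement.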

@Jota's "( \\S+) " is trickier: it finds any word with a space before and after it, then replaces the trailing space with a newline. This works because matches cannot overlap. The first word it catches is the second word of the string. The next word it catches is not the third, since the space in front of the third word was already "consumed" as the trailing space of the second word's match, but rather the fourth; and so on.
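To see which pieces each pattern actually grabs, here is a small sketch using base R's gregexpr and regmatches on the string a from the question. The names jota_hits and pair_hits are only for illustration.

```r
a <- "this string has an even number of words"

# Jota's pattern: every second word, together with the space on each side
jota_hits <- regmatches(a, gregexpr(" \\S+ ", a))[[1]]
jota_hits
# matches: " string ", " an ", " number "

# The two-word pattern: each pair of words plus the space that follows it
pair_hits <- regmatches(a, gregexpr("[^ ]+ [^ ]+ ", a))[[1]]
pair_hits
# matches: "this string ", "has an ", "even number "
```

In both cases the final "of words" is left alone, because there is no space after "words" for the pattern to match.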

Oh, and some basic regex stuff.

  • [^xyz] means any single char except the chars x, y, and z.
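  • \\s is any whitespace character (space, tab, newline), while \\S is any non-whitespace character; the backslash is doubled because it is written inside an R string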
  • x+ means x one or more times
  • (x) "captures" x, allowing for reference in the replacement, like \\1
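Each of those pieces can be tried on its own with gsub. The inputs below are made-up toy strings, chosen only to show one piece at a time:

```r
# [^xyz]: replace every character that is not x, y, or z
r1 <- gsub("[^xyz]", "-", "axbycz")   # "-x-y-z"

# \\s: replace whitespace (here a space and a tab)
r2 <- gsub("\\s", "_", "a b\tc")      # "a_b_c"

# \\S: replace everything that is not whitespace
r3 <- gsub("\\S", "_", "a b")         # "_ _"

# x+: a run of one or more x's is replaced as a single match
r4 <- gsub("x+", "X", "axxbx")        # "aXbX"

# (x): capture the x and put it back with \\1, dropping the y after it
r5 <- gsub("(x)y", "\\1", "xyz")      # "xz"
```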

Upvotes: 8
