C_Z_

Reputation: 7796

Remove trailing and leading spaces and extra internal whitespace with one gsub call

I know you can remove trailing and leading spaces with

gsub("^\\s+|\\s+$", "", x)

And you can collapse runs of internal whitespace into a single space with

gsub("\\s+"," ",x)

I can combine these into one function, but I was wondering whether there is a way to do it with just one call to gsub.

trim <- function (x) {
  x <- gsub("^\\s+|\\s+$", "", x)
  gsub("\\s+", " ", x)
}

testString <- "  This is a      test. "

trim(testString)

Upvotes: 11

Views: 2328

Answers (6)

BrodieG

Reputation: 52637

Here is an option:

gsub("^ +| +$|( ) +", "\\1", testString)  # with Frank's input, and Agstudy's style

We use a capturing group so that each run of multiple internal spaces is replaced by a single space; leading and trailing runs match with the group empty, so they are removed entirely. Change " " to \\s if you also need to handle whitespace other than plain spaces.
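As a minimal sketch of how the capturing group behaves, the pattern can be wrapped in a small helper. The name squish_spaces is hypothetical and is not part of the answer; the pattern itself is the one shown above.

```r
# Hypothetical helper name; the pattern is the one from this answer.
squish_spaces <- function(x) {
  # "^ +" and " +$" match with group 1 unset, so "\\1" expands to "".
  # "( ) +" matches a run of two or more spaces and puts back the single
  # space captured in group 1.
  gsub("^ +| +$|( ) +", "\\1", x)
}

testString <- "  This is a      test. "
squish_spaces(testString)
# [1] "This is a test."
```

Because all three cases share one replacement string, the backreference is what lets a single gsub call both delete and collapse.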

Upvotes: 9

G. Grothendieck

Reputation: 269481

If an answer not using gsub is acceptable then the following does it. It does not use any regular expressions:

paste(scan(textConnection(testString), what = "", quiet = TRUE), collapse = " ")

giving:

[1] "This is a test."

Upvotes: 1

agstudy

Reputation: 121568

Using a positive lookbehind:

gsub("^ *|(?<= ) | *$",'',testString,perl=TRUE)
# "This is a test."

Explanation:

## "^ *"      matches any leading spaces
## "(?<= ) "  the general form is (?<=a)b:
##            matches a "b" (a space here)
##            that is preceded by "a" (another space here)
## " *$"      matches trailing spaces
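One way to check the lookbehind version, sketched here on the assumption that the input contains only plain spaces, is to compare it with the two-step approach from the question. The variable names one_call and two_step are illustrative only.

```r
testString <- "  This is a      test. "

# Single call: every space that directly follows another space is deleted,
# along with any leading and trailing spaces.
one_call <- gsub("^ *|(?<= ) | *$", "", testString, perl = TRUE)

# Two-step reference: trim the ends, then collapse internal runs.
two_step <- gsub("\\s+", " ", gsub("^\\s+|\\s+$", "", testString))

identical(one_call, two_step)
# [1] TRUE
```

Note that perl = TRUE is required here, since lookbehind is not supported by R's default regex engine.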

Upvotes: 8

karthik manchala

Reputation: 13640

You can just add \\s+(?=\\s) to your original regex:

gsub("^\\s+|\\s+$|\\s+(?=\\s)", "", x, perl = TRUE)

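A short sketch of how the lookahead alternative behaves, with the pattern wrapped in a helper whose name (collapse_ws) is hypothetical. One caveat, which is an observation about the pattern rather than something stated in the answer: in a mixed run of whitespace, the character that survives is the last one in the run, which is not necessarily a space.

```r
# Hypothetical helper name; the pattern is the one from this answer.
collapse_ws <- function(x) {
  # "\\s+(?=\\s)" matches whitespace only while more whitespace follows,
  # so it consumes all but the last character of each internal run.
  gsub("^\\s+|\\s+$|\\s+(?=\\s)", "", x, perl = TRUE)
}

collapse_ws("  This is a      test. ")
# [1] "This is a test."

# Mixed run (space followed by tab): the tab is the character that remains.
collapse_ws("a \tb")
# [1] "a\tb"
```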

Upvotes: 6

A5C1D2H2I1M1N2O1R2T1

Reputation: 193517

You've asked for a gsub option and gotten good options. There's also rm_white_multiple from "qdapRegex":

> testString <- "  This is a      test. "
> library(qdapRegex)
> rm_white_multiple(testString)
[1] "This is a test."

Upvotes: 4

Alexey Ferapontov

Reputation: 5169

You can also nest the gsub calls, though this is less elegant than the previous answers:

> gsub("\\s+", " ", gsub("^\\s+|\\s+$", "", testString))
[1] "This is a test."

Upvotes: 0
