gladys_c_hugh

Reputation: 378

Return how much trimws has trimmed from a dataframe

I'm using trimws(x) to trim whitespace from a dataset.

Like the output of a "find and replace" in Excel, I'd like to know how much work trimws has done, specifically how much whitespace it has removed from the whole data frame. This is mostly for my own satisfaction, but I might also group the counts by other variables to see whether there's any pattern to where the whitespace is creeping in upstream.

Example:

x <- "  Some text.  "

trimws(x)

And then some output like:

# trimws removed 1708 white space characters and 13 new line characters

Upvotes: 1

Views: 115

Answers (2)

G. Grothendieck

Reputation: 269885

This is the number of whitespace characters removed:

x <- "  Some text.  "
nchar(x) - nchar(trimws(x))  # number of whitespace characters removed
## [1] 4

The only whitespace in the example is spaces. If newlines, carriage returns and tabs are possible too, and presumably they occur only in the trimmed part, then this gives the number of spaces removed:

xx <- gsub("[\n\r\t]", "", x)
nchar(xx) - nchar(trimws(xx))  # no of spaces removed

The difference between this and the result of the first code snippet above is the number of non-space whitespace characters (newlines, carriage returns and tabs) removed.
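Since the question asks about a whole data frame, the same difference can be taken column by column. The following is only a minimal sketch, assuming the text columns are stored as character vectors; the data frame `df`, its column names and the helper `count_trimmed()` are made up for illustration.

```r
# Hypothetical example data; the column names are made up for illustration.
df <- data.frame(
  id     = 1:3,
  name   = c("  Ann", "Bob  ", "Cy"),
  note   = c("ok\n", " fine ", "good"),
  source = c("a", "b", "a"),
  stringsAsFactors = FALSE
)

# Characters removed by trimws() from each element of a character vector.
count_trimmed <- function(x) nchar(x) - nchar(trimws(x))

# Total characters removed per character column (NA entries are ignored).
is_chr  <- vapply(df, is.character, logical(1))
removed <- vapply(df[is_chr],
                  function(col) sum(count_trimmed(col), na.rm = TRUE),
                  numeric(1))
removed

# Total over the whole data frame.
sum(removed)

# Characters removed from one column, grouped by another variable.
by_source <- tapply(count_trimmed(df$note), df$source, sum)
by_source
```

Grouping with tapply() is one way to look for a pattern in where the whitespace comes from; any other grouping tool would work on the same per-element counts.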

Upvotes: 1

RLave

Reputation: 8374

Based on your example you can modify the current code for trimws() (available here).

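You just need to replace sub() with regexpr() and add up the match.length attribute of its result in order to count how many whitespace characters are found in x.

my_trimws = function(x, which = c("both", "left", "right")) {
  which = match.arg(which)
  # length of the match of `re` in each element of x (0 where nothing matches)
  mycount = function(re, x) pmax(attr(regexpr(re, x, perl = TRUE), "match.length"), 0L)

  left = if (which != "right") mycount("^[ \t\r\n]+", x) else 0L
  # for "both", look at the right end only after stripping the left end,
  # so that an all-whitespace string is not counted twice
  rest = if (which == "both") sub("^[ \t\r\n]+", "", x, perl = TRUE) else x
  right = if (which != "left") mycount("[ \t\r\n]+$", rest) else 0L

  n <- sum(left, right)

  cat("trimws() removed ", n, " whitespace characters\n") # prints to the console
  return(n)
}

Your example:

x <- "  Some text.  "
r <- my_trimws(x)
# trimws() removed  4  whitespace characters
# r
# [1] 4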
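The output asked for in the question separates spaces from newline characters. One possible way to get that breakdown is to extract the leading and trailing whitespace runs and tabulate their characters. This is a rough sketch only; `count_trimmed_by_type()` is a made-up helper name, and it assumes the same character class, `[ \t\r\n]`, that appears in the patterns above.

```r
# Count the characters that trimming would strip, broken down by type.
# `count_trimmed_by_type` is a hypothetical helper, not part of base R.
count_trimmed_by_type <- function(x) {
  x <- x[!is.na(x)]
  # Leading whitespace run of each element that has one.
  lead <- regmatches(x, regexpr("^[ \t\r\n]+", x))
  # Strip the leading run first so all-whitespace strings are counted once.
  rest <- sub("^[ \t\r\n]+", "", x)
  # Trailing whitespace run of each element that has one.
  trail <- regmatches(rest, regexpr("[ \t\r\n]+$", rest))
  # Split the extracted runs into single characters and tabulate them.
  removed <- unlist(strsplit(c(lead, trail), ""))
  c(space           = sum(removed == " "),
    tab             = sum(removed == "\t"),
    newline         = sum(removed == "\n"),
    carriage_return = sum(removed == "\r"))
}

x <- c("  Some text.  ", "\tline one\n", "no padding")
res <- count_trimmed_by_type(x)
res
```

For this made-up input the sketch reports 4 spaces, 1 tab and 1 newline. Applied to a column of a data frame, the result could be pasted into a message in the style the question describes.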

Upvotes: 0
