Mark

Reputation: 1769

Truncate string (without truncating words)

I have the following string:

x <- "This string is moderately long"

from which I would like to obtain a truncated string such as

"This string is"

or

"This string is moderately"

but not, e.g.,

"This string is m..."

The function str_trunc() from stringr does not give what I want, because it cuts words in the middle:

library(stringr)
rbind(
  str_trunc(x, 20, "right"),
  str_trunc(x, 20, "left"),
  str_trunc(x, 20, "center")
)
#>      [,1]                  
#> [1,] "This string is mo..."
#> [2,] "...s moderately long"
#> [3,] "This stri...ely long"

Upvotes: 2

Views: 565

Answers (2)

Allan Cameron

Reputation: 174338

I interpreted the OP (perhaps wrongly) as wanting to be able to truncate strings to a certain length without words being cut off. An approach like this would be effective:

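trunc_not_words <- function(s, len)
{
  # Nothing to truncate if the whole string already fits
  if(len >= nchar(s)) return(s)
  s2 <- substr(s, 1, len)
  # Positions of non-word characters in s, plus one position past the end
  boundaries <- c(gregexpr("\\W", s)[[1]], nchar(s) + 1)
  # The first boundary lies beyond the cut: return an empty string
  if(min(boundaries) > nchar(s2)) return("")
  # The cut lands exactly at the end of a word: keep s2 as is
  if(min(boundaries[boundaries > nchar(s2)]) == nchar(s2) + 1) return(s2)
  # Otherwise drop everything from the last boundary inside s2 onwards
  return(substr(s2, 1, max(boundaries[boundaries <= nchar(s2)]) - 1))
}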

This gives the following results for each length from 1 up to the length of the string:

for(i in 1:nchar(x)) cat("#> ", i, ": \"", trunc_not_words(x, i), "\"\n", sep = "")
#> 1: ""
#> 2: ""
#> 3: ""
#> 4: ""
#> 5: "This"
#> 6: "This"
#> 7: "This"
#> 8: "This"
#> 9: "This"
#> 10: "This"
#> 11: "This string"
#> 12: "This string"
#> 13: "This string"
#> 14: "This string is"
#> 15: "This string is"
#> 16: "This string is"
#> 17: "This string is"
#> 18: "This string is"
#> 19: "This string is"
#> 20: "This string is"
#> 21: "This string is"
#> 22: "This string is"
#> 23: "This string is"
#> 24: "This string is"
#> 25: "This string is moderately"
#> 26: "This string is moderately"
#> 27: "This string is moderately"
#> 28: "This string is moderately"
#> 29: "This string is moderately"
#> 30: "This string is moderately long"
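
For comparison, the same idea can probably be expressed with a single regular expression. The following is only a sketch in base R: trunc_at_word is a hypothetical name, and it assumes that words are separated by whitespace (rather than by any non-word character, as in the function above). Unlike that function, it keeps a first word that fits exactly, so a length of 4 returns "This" instead of an empty string.

```r
x <- "This string is moderately long"

# Hypothetical helper (not from any package): keep the longest prefix of at
# most `len` characters that is followed by whitespace or the end of the string.
trunc_at_word <- function(s, len) {
  # Greedy match of up to `len` characters, with a lookahead for a boundary
  pattern <- sprintf("^.{0,%d}(?=\\s|$)", as.integer(len))
  m <- regmatches(s, regexpr(pattern, s, perl = TRUE))
  # No match means that not even the first word fits within `len` characters
  if (length(m) == 0) return("")
  # Drop trailing whitespace left over when several spaces occur in a row
  trimws(m, which = "right")
}

print(trunc_at_word(x, 20))
print(trunc_at_word(x, 25))
```

The lookahead needs perl = TRUE, since the default regex engine used by regexpr does not support it.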

Upvotes: 3

akrun

Reputation: 887571

The dots come from the ellipsis argument, whose default is "...". If we change it to an empty string (""), the result is

library(stringr)
str_trunc(x, 25, "right", ellipsis = "")
#[1] "This string is moderately"
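
Note that a width of 25 happens to fall exactly at the end of "moderately". A quick check, assuming stringr is installed, suggests that other widths still cut through a word, so setting ellipsis = "" removes the dots but does not by itself respect word boundaries. The variable names at_boundary and mid_word are only illustrative.

```r
library(stringr)

x <- "This string is moderately long"

# Width 25 ends exactly after "moderately", so no word is cut
at_boundary <- str_trunc(x, 25, "right", ellipsis = "")

# Width 20 falls inside "moderately", so the word is still cut
mid_word <- str_trunc(x, 20, "right", ellipsis = "")

print(at_boundary)
print(mid_word)
```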

Upvotes: 0
