R regex: trimming whitespace before punctuation in a string

Question

I have a string that I downloaded from the web:

x = "the company 's newly launched cryptocurrency , Libra , hasn 't been contacted by Facebook , according to a report ."

They parsed the string such that: ...In addition, contracted words like (can't) are separated into two parts (ca n't) and punctuation is separated from words (eye level . As her).

I want to restore the string to its normal form, for example:

x = "the company's newly launched cryptocurrency, Libra, hasn't been contacted by Facebook, according to a report."

How do I trim the space before the punctuation?

I have thought about using str_remove_all with a regex:

str_remove_all(x, "\\s[[:punct:]]")

but that also removes the punctuation.

Any ideas?

Eyayaw · Accepted Answer

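Use a backreference: capture the whitespace and the punctuation in separate groups, then replace the whole match with only the second group.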

x <- "the company 's newly launched cryptocurrency , Libra , hasn 't been contacted by Facebook , according to a report ."

gsub("(\\s+)([[:punct:]])", "\\2", x, perl = TRUE)

# [1] "the company's newly launched cryptocurrency, Libra, hasn't been contacted by Facebook, according to a report."
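
As a variation on the backreference approach, a lookahead can match only the whitespace, so nothing needs to be put back. The sketch below is not part of the original answer. It assumes base R with perl = TRUE, and the second example string is made up for illustration. Note that [[:punct:]] also covers opening brackets and quotes, so a narrower character class may be safer on text that contains them.

```r
x <- "the company 's newly launched cryptocurrency , Libra , hasn 't been contacted by Facebook , according to a report ."

# Lookahead: match whitespace only when punctuation follows.
# The punctuation itself is not consumed, so it stays in the string.
y <- gsub("\\s+(?=[[:punct:]])", "", x, perl = TRUE)
y
# [1] "the company's newly launched cryptocurrency, Libra, hasn't been contacted by Facebook, according to a report."

# Assumption: a narrower class that leaves the spacing around brackets alone.
# The input string here is a made-up example.
z <- gsub("\\s+(?=[,.;:!?'])", "", "see ( figure 2 ) , then stop .", perl = TRUE)
z
# [1] "see ( figure 2 ), then stop."
```

With the narrower class, only the spaces before the comma and the full stop are removed. With [[:punct:]], the spaces before "(" and ")" would be removed as well.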
