Reputation: 109984
I have a strange request with regex in R. I have a vector of character strings, some of which have multiple trailing periods. I want to replace each of these periods with a space. The example and desired outcome should make clear what I'm after (maybe I need to attack it through what I give to the replacement argument rather than the pattern argument of gsub):
Example and attempts:
x <- c("good", "little.bad", "really.ugly......")
gsub("\\.$", " ", x)
#produces this
#[1] "good" "little.bad" "really.ugly..... "
gsub("\\.+$", " ", x)
#produces this
#[1] "good" "little.bad" "really.ugly "
Desired outcome:
[1] "good" "little.bad" "really.ugly      "
So the last string in the original vector (x) has 6 periods at the end, and I'd like 6 spaces in their place without touching the period between really and ugly. I know the $ anchors the match at the end of the string, but I can't get past this.
Upvotes: 11
Views: 4457
Reputation: 61953
Tim's solution is clearly better, but I figured I'd try my hand at an alternative. Liberal use of regmatches helps us out here:
x <- c("good", "little.bad", "really.ugly......")
# Get an object with 'match data' to feed into regmatches
# Here we match on any number of periods at the end of a string
out <- regexpr("\\.*$", x)
# On the right hand side we extract the pieces of the strings
# that match our pattern with regmatches and then replace
# all the periods with spaces. Then we use assignment
# to store that into the spots in our strings that match the
# regular expression.
regmatches(x, out) <- gsub("\\.", " ", regmatches(x, out))
x
#[1] "good" "little.bad" "really.ugly      "
So it's not quite as clean as a single regular expression, but I've never really gotten around to learning lookaheads in Perl-style regular expressions.
Upvotes: 2
Reputation: 109984
While I waited for a regex solution that makes sense, I decided to come up with a nonsensical way to solve this:
messy.sol <- function(x) {
    # Strip the trailing periods, then append one space per period removed
    stripped <- gsub("\\.+$", "", x)
    paste(c(stripped, rep(" ", nchar(x) - nchar(stripped))), collapse = "")
}
sapply(x, messy.sol, USE.NAMES = FALSE)
I'd say Tim's is a bit prettier :)
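For what it's worth, the same count-and-pad idea can be written without sapply. This is only a sketch: pad_trailing_dots is a made-up name, and it assumes R >= 3.3.0 so that strrep is available.

```r
# Hypothetical helper (name is illustrative, not from the thread):
# strip the trailing periods, then pad with one space per period removed.
pad_trailing_dots <- function(x) {
    stripped <- sub("\\.+$", "", x)        # drop the run of trailing periods
    n_dots <- nchar(x) - nchar(stripped)   # how many periods were dropped
    paste0(stripped, strrep(" ", n_dots))  # strrep is vectorised over n_dots
}

x <- c("good", "little.bad", "really.ugly......")
res <- pad_trailing_dots(x)
print(res)
```

Every string keeps its original length, so nchar(res) should equal nchar(x).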
Upvotes: 2
Reputation: 336418
Try this (mystring is your character vector, x in the question):
gsub("\\.(?=\\.*$)", " ", mystring, perl=TRUE)
Explanation:
\.       # Match a dot
(?=      # only if followed by
  \.*    # zero or more dots
  $      # until the end of the string
)        # End of lookahead assertion.
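To see the lookahead at work on the vector from the question, here is a small check. It is a sketch rather than anything definitive; the variable name res is only for illustration.

```r
x <- c("good", "little.bad", "really.ugly......")

# perl = TRUE is needed because lookahead is a Perl-style (PCRE) feature.
# Each trailing period is matched on its own, so it is replaced one for one.
res <- gsub("\\.(?=\\.*$)", " ", x, perl = TRUE)
print(res)

# The period inside "little.bad" and "really.ugly" is followed by letters,
# so the lookahead fails there and those periods are left alone.
stopifnot(identical(nchar(res), nchar(x)))
```

Because the lookahead consumes nothing, gsub steps through the trailing run one period at a time, which is why six periods become six spaces rather than one.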
Upvotes: 17