Reputation: 42293
How can I remove the periods before the first comma in these strings?
xx <- c("fefe.3. fregg, ff, 34.gr. trgw",
"fefe3. fregg, ff, 34.gr. trgw",
"fefe3 fregg, ff, 34.gr. tr.gw")
Desired output:
"fefe3 fregg, ff, 34.gr. trgw"
"fefe3 fregg, ff, 34.gr. trgw"
"fefe3 fregg, ff, 34.gr. tr.gw"
I've started with gsub("\\.", "", xx), which removes all periods. How can I change it to remove only the periods before the first comma?
Upvotes: 4
Views: 383
Reputation: 269664
This uses gsubfn in the gsubfn package to extract the longest substring starting at the beginning of the string and containing no commas. (This would be the entire string if there were no commas in it.) It then uses gsub to remove the periods within that. (If it were desired to only remove the first period within the substring then change the gsub to sub.)
library(gsubfn)
gsubfn("^[^,]*", ~ gsub("\\.", "", x), xx)
The result is:
[1] "fefe3 fregg, ff, 34.gr. trgw"
[2] "fefe3 fregg, ff, 34.gr. trgw"
[3] "fefe3 fregg, ff, 34.gr. tr.gw"
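For readers who would rather not load an extra package, the same idea (transform only the leading comma-free prefix) can probably be sketched in base R with regexpr and the replacement form of regmatches. This is a minimal sketch rather than the answer's own method, and the helper name strip_periods_before_comma is made up for illustration.

```r
xx <- c("fefe.3. fregg, ff, 34.gr. trgw",
        "fefe3. fregg, ff, 34.gr. trgw",
        "fefe3 fregg, ff, 34.gr. tr.gw")

# Hypothetical helper name; not part of base R or gsubfn
strip_periods_before_comma <- function(x) {
  # Locate the leading run of non-comma characters in each string
  m <- regexpr("^[^,]*", x)
  # Replace each matched prefix with a copy that has its periods removed
  regmatches(x, m) <- gsub(".", "", regmatches(x, m), fixed = TRUE)
  x
}

result <- strip_periods_before_comma(xx)
print(result)
```

As with the gsubfn version, a string that contains no comma is treated as one long prefix, so all of its periods are removed.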
Upvotes: 1
Reputation: 109874
I don't know about speed or amount of typing but here's an approach using qdap's beg2char and char2end functions:
## xx <- c("fefe.3. fregg, ff, 34.gr. trgw",
## "fefe3. fregg, ff, 34.gr. trgw",
## "fefe3 fregg, ff, 34.gr. tr.gw")
library(qdap)
paste0(gsub("\\.", "", beg2char(xx, ",")), char2end(xx, ",", include=TRUE))
## > paste0(gsub("\\.", "", beg2char(xx, ",")), char2end(xx, ",", include=TRUE))
## [1] "fefe3 fregg, ff, 34.gr. trgw" "fefe3 fregg, ff, 34.gr. trgw"
## [3] "fefe3 fregg, ff, 34.gr. tr.gw"
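The split-at-the-first-comma idea can also be sketched without qdap, using only base R string functions. This is an illustrative equivalent under the assumption that beg2char returns the text before the first comma and char2end (with include=TRUE) returns the comma and everything after it; the variable names are made up for the example.

```r
xx <- c("fefe.3. fregg, ff, 34.gr. trgw",
        "fefe3. fregg, ff, 34.gr. trgw",
        "fefe3 fregg, ff, 34.gr. tr.gw")

# Position of the first comma in each string (-1 when there is none)
pos <- as.integer(regexpr(",", xx, fixed = TRUE))
# Treat a missing comma as sitting just past the end of the string
no_comma <- pos < 0
pos[no_comma] <- nchar(xx)[no_comma] + 1L

head_part <- substr(xx, 1L, pos - 1L)    # text before the first comma
tail_part <- substr(xx, pos, nchar(xx))  # first comma and everything after it

# Remove periods from the head only, then glue the two parts back together
split_result <- paste0(gsub(".", "", head_part, fixed = TRUE), tail_part)
print(split_result)
```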
Upvotes: 1
Reputation: 193527
I feel like this is cheating, but it works for this simple example....
xx <- c("fefe.3. fregg, ff, 34.gr. trgw",
"fefe3. fregg, ff, 34.gr. trgw",
"fefe3 fregg, ff, 34.gr. tr.gw")
temp <- strsplit(xx, ",")
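sapply(seq_along(temp), function(x) {
  # Strip periods from the first chunk only (the text before the first comma)
  t1 <- gsub("\\.", "", temp[[x]][1])
  # Rejoin with however many remaining chunks there are
  paste(c(t1, temp[[x]][-1]), collapse = ",")
})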
# [1] "fefe3 fregg, ff, 34.gr. trgw" "fefe3 fregg, ff, 34.gr. trgw"
# [3] "fefe3 fregg, ff, 34.gr. tr.gw"
The basic idea is that, since you're only looking for periods in the first chunk before a comma, you can split the string, use a basic gsub on that chunk, and then put the pieces back together. Unlikely to be efficient....
Upvotes: 4
Reputation: 179438
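Try this:
gsub("\\.(.*,.*)", "\\1", xx)
[1] "fefe3. fregg, ff, 34.gr. trgw"
[2] "fefe3 fregg, ff, 34.gr. trgw"
[3] "fefe3 fregg, ff, 34.gr. tr.gw"
The regex works like this:
\\. looks for a period
(.*,.*) looks for a comma inside other text, and groups it
\\1 refers to the first group
Note that the greedy group consumes the rest of the string after the first matching period, so each string loses at most one period: the first one that has a comma somewhere after it. That is why the second period in the first string survives. It also means a period that sits after the first comma but before a later comma would be removed.
Upvotes: 3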
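The pattern gsub("\\.(.*,.*)", "\\1", xx) matches from the first qualifying period to the end of the string, so gsub can make at most one replacement per string. One possible single-regex variant that removes every period before the first comma is sketched below. It swaps in a different technique, the PCRE \G anchor (available with perl = TRUE), which forces each match to start where the previous one ended. The variable names are illustrative.

```r
xx <- c("fefe.3. fregg, ff, 34.gr. trgw",
        "fefe3. fregg, ff, 34.gr. trgw",
        "fefe3 fregg, ff, 34.gr. tr.gw")

# \G anchors each match at the end of the previous one, so matching stops
# as soon as a comma is reached before the next period
anchored <- gsub("\\G([^.,]*)\\.", "\\1", xx, perl = TRUE)
print(anchored)

# For comparison: the greedy pattern removes only one period per string
greedy <- gsub("\\.(.*,.*)", "\\1", xx)
print(greedy)
```

With the anchored pattern, a string that has no comma at all loses all of its periods, which matches the behaviour of the prefix-based answers in this thread.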