Ben

Reputation: 42293

Remove periods before the first comma in a string

How can I remove the periods before the first comma in these strings?

    xx <- c("fefe.3. fregg, ff, 34.gr. trgw",
            "fefe3. fregg, ff, 34.gr. trgw",
            "fefe3 fregg, ff, 34.gr. tr.gw")

Desired output:

    "fefe3 fregg, ff, 34.gr. trgw"
    "fefe3 fregg, ff, 34.gr. trgw"
    "fefe3 fregg, ff, 34.gr. tr.gw" 

I've started with gsub("\\.", "", xx), which removes all periods. How can I change it to remove only the periods before the first comma?

Upvotes: 4

Views: 383

Answers (4)

G. Grothendieck

Reputation: 269664

This uses gsubfn in the gsubfn package to extract the longest substring starting at the beginning of the string and containing no commas. (This would be the entire string if there were no commas in it). It then uses gsub to remove the periods within that. (If it were desired to only remove the first period within the substring then change the gsub to sub.)

library(gsubfn)
gsubfn("^[^,]*", ~ gsub("\\.", "", x), xx)

The result is:

[1] "fefe3 fregg, ff, 34.gr. trgw" 
[2] "fefe3 fregg, ff, 34.gr. trgw" 
[3] "fefe3 fregg, ff, 34.gr. tr.gw"
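
For anyone who would rather not load a package, the same idea can probably be sketched in base R with regexpr and the replacement form of regmatches. This is only a sketch: it assumes xx is the vector from the question, and the names m and out are just illustrative.

```r
xx <- c("fefe.3. fregg, ff, 34.gr. trgw",
        "fefe3. fregg, ff, 34.gr. trgw",
        "fefe3 fregg, ff, 34.gr. tr.gw")

# Locate the longest comma-free prefix of each string
m <- regexpr("^[^,]*", xx)

# Work on a copy so xx stays untouched
out <- xx

# Overwrite each prefix with a copy that has its periods removed
regmatches(out, m) <- gsub(".", "", regmatches(out, m), fixed = TRUE)

out
```

Using fixed = TRUE treats the period as a literal character, so no escaping is needed in the inner gsub.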

Upvotes: 1

Tyler Rinker

Reputation: 109874

I don't know about speed or amount of typing but here's an approach using qdap's beg2char and char2end functions:

## xx <- c("fefe.3. fregg, ff, 34.gr. trgw", 
##     "fefe3. fregg, ff, 34.gr. trgw",
##     "fefe3 fregg, ff, 34.gr. tr.gw")

library(qdap)

paste0(gsub("\\.", "", beg2char(xx, ",")), char2end(xx, ",", include=TRUE))

## > paste0(gsub("\\.", "", beg2char(xx, ",")), char2end(xx, ",", include=TRUE))
## [1] "fefe3 fregg, ff, 34.gr. trgw"  "fefe3 fregg, ff, 34.gr. trgw" 
## [3] "fefe3 fregg, ff, 34.gr. tr.gw"
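
If installing qdap is not an option, a rough base R equivalent might look like the sketch below. It assumes, based on how the call above uses them, that beg2char returns everything before the first comma and that char2end with include=TRUE returns the rest starting at that comma. The variable names (comma_pos, n_head, head_part, tail_part, out) are made up for illustration.

```r
xx <- c("fefe.3. fregg, ff, 34.gr. trgw",
        "fefe3. fregg, ff, 34.gr. trgw",
        "fefe3 fregg, ff, 34.gr. tr.gw")

# Position of the first comma in each string (-1 when there is none)
comma_pos <- as.integer(regexpr(",", xx, fixed = TRUE))

# Number of characters before the first comma; the whole string if no comma
n_head <- ifelse(comma_pos > 0L, comma_pos - 1L, nchar(xx))

head_part <- substr(xx, 1L, n_head)            # text before the first comma
tail_part <- substr(xx, n_head + 1L, nchar(xx)) # first comma and everything after

# Remove periods from the head only, then glue the two parts back together
out <- paste0(gsub(".", "", head_part, fixed = TRUE), tail_part)
out
```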

Upvotes: 1

A5C1D2H2I1M1N2O1R2T1

Reputation: 193527

I feel like this is cheating, but it works for this simple example....

xx <- c("fefe.3. fregg, ff, 34.gr. trgw", 
        "fefe3. fregg, ff, 34.gr. trgw",
        "fefe3 fregg, ff, 34.gr. tr.gw")

temp <- strsplit(xx, ",")

sapply(temp, function(parts) {
  # Strip periods from the chunk before the first comma only
  parts[1] <- gsub("\\.", "", parts[1])
  # Rejoin all chunks, however many commas the string had
  paste(parts, collapse = ",")
})
# [1] "fefe3 fregg, ff, 34.gr. trgw"  "fefe3 fregg, ff, 34.gr. trgw" 
# [3] "fefe3 fregg, ff, 34.gr. tr.gw"

The basic idea is that since you're only looking for periods in the first chunk before a comma, you can split the string at the commas, run a basic gsub on that first chunk, and then put the pieces back together. It's unlikely to be efficient.

Upvotes: 4

Andrie

Reputation: 179438

Try this:

gsub("\\.(.*,.*)","\\1", xx)
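[1] "fefe3. fregg, ff, 34.gr. trgw"
[2] "fefe3 fregg, ff, 34.gr. trgw"
[3] "fefe3 fregg, ff, 34.gr. tr.gw"

The regex works like this:

  • \\. matches a literal period
  • (.*,.*) captures the rest of the string, provided it contains a comma
  • \\1 puts the captured text back, so only the period is dropped

Because the group consumes everything up to the end of the string, each string gets at most one match: only the first period that has a comma somewhere after it is removed. That is why the first element still contains one period.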
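
If every period before the first comma has to go in a single call, one option (a sketch, not tested beyond the strings shown) is a PCRE pattern anchored with \G, which needs perl = TRUE. The extra string in xx2 is a made-up case with a period after the first comma and a second comma later on.

```r
xx <- c("fefe.3. fregg, ff, 34.gr. trgw",
        "fefe3. fregg, ff, 34.gr. trgw",
        "fefe3 fregg, ff, 34.gr. tr.gw")

# \\G   : match only where the previous match ended (or at the string start)
# [^.,]*: skip characters that are neither a period nor a comma
# \\.   : the period to drop; the captured text before it is kept via \\1
# The chain of matches breaks at the first comma, so later periods survive.
out <- gsub("\\G([^.,]*)\\.", "\\1", xx, perl = TRUE)
out

# Hypothetical extra case: period after the first comma, another comma later
xx2 <- "a.b, c.d, e"
out2 <- gsub("\\G([^.,]*)\\.", "\\1", xx2, perl = TRUE)
out2
```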

Upvotes: 3
