Reputation:
How do I remove just the ".00" in a data frame column, so that 5.00% /GP per tonne
becomes 5% /GP per tonne
? I am trying something like the code below, but R reports an invalid regexp.
sched$duty_rate <- c("5.00% /GP per tonne","10.00% /GP per tonne")
for(i in 1:nrow(sched)){
sched$duty_rate[i] <- gsub("[.]?[0-9]*(?=%)","\\%",sched$duty_rate[i])
}
Upvotes: 2
Views: 146
Reputation: 163277
Your pattern [.]?[0-9]*(?=%)
matches an optional dot and optional digits, then asserts a %
to the right. That is broader than what you need, which is a mandatory dot followed only by zeroes.
The positive lookahead is only supported with perl=TRUE, which is why the default engine reports an invalid regexp.
With perl=TRUE you can match a dot and one or more zeroes using the more specific pattern [.]0+(?=%):
duty_rate <- c("5.00% /GP per tonne","10.00% /GP per tonne", "5.0% /GP per 2.000 tonne","10.0000% /GP per 20.000 tonne")
gsub("[.]0+(?=%)", "", duty_rate, perl=TRUE)
Output
[1] "5% /GP per tonne" "10% /GP per tonne"
[3] "5% /GP per 2.000 tonne" "10% /GP per 20.000 tonne"
See an R demo.
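As a minimal sketch of the point above, the snippet below applies the same pattern to a whole column at once, since gsub is vectorised and the for loop in the question is not needed. The data frame sched is rebuilt here as an assumption to keep the example self-contained, and the variable name rejected is only illustrative.

```r
# Hypothetical data frame mirroring the column from the question.
sched <- data.frame(
  duty_rate = c("5.00% /GP per tonne", "10.00% /GP per tonne"),
  stringsAsFactors = FALSE
)

# Without perl = TRUE the default regex engine does not accept the lookahead;
# tryCatch records whether the call failed instead of stopping the script.
rejected <- tryCatch(
  {
    gsub("[.]0+(?=%)", "", sched$duty_rate)
    FALSE
  },
  error = function(e) TRUE,
  warning = function(w) TRUE
)

# gsub() is vectorised, so the whole column is cleaned in a single call.
sched$duty_rate <- gsub("[.]0+(?=%)", "", sched$duty_rate, perl = TRUE)
print(sched$duty_rate)
```

The loop over nrow(sched) can therefore be dropped entirely.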
Upvotes: 0
Reputation: 18585
A fairly simple and readable alternative is the str_remove
function from the stringr
package.
x <- c("5.00% /GP per tonne", "10.00% /GP per tonne")
# fixed() treats ".00" literally; as a regex, "." would match any character
stringr::str_remove(string = x, pattern = stringr::fixed(".00"))
#> [1] "5% /GP per tonne" "10% /GP per tonne"
Created on 2022-04-06 by the reprex package (v2.0.1)
Upvotes: 1
Reputation: 626758
You can use
sched$duty_rate <- gsub("\\.0+(%)", "\\1", sched$duty_rate)
Details:
\. - a literal dot
0+ - one or more 0 chars
(%) - Group 1: a % char (so that we do not remove zeros from other contexts, only between . and %)
See the R demo:
duty_rate <- c("5.00% /GP per tonne","10.00% /GP per tonne", "5.0% /GP per 2.000 tonne","10.0000% /GP per 20.000 tonne")
gsub("\\.0+(%)", "\\1", duty_rate)
## => [1] "5% /GP per tonne" "10% /GP per tonne"
## [3] "5% /GP per 2.000 tonne" "10% /GP per 20.000 tonne"
See the regex demo.
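For comparison, here is a small sketch suggesting that the capture-group form and the lookahead form from the other answer should give the same result on this kind of input. The variable names with_group and with_lookahead are illustrative only.

```r
duty_rate <- c("5.00% /GP per tonne", "10.0000% /GP per 20.000 tonne")

# Capture group: consume the % and put it back via \\1 (default regex engine).
with_group <- gsub("\\.0+(%)", "\\1", duty_rate)

# Lookahead: assert the % without consuming it (requires perl = TRUE).
with_lookahead <- gsub("\\.0+(?=%)", "", duty_rate, perl = TRUE)

print(identical(with_group, with_lookahead))
```

The capture-group version has the small advantage of not depending on perl = TRUE.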
Upvotes: 1
Reputation: 51582
You can use sub with fixed = TRUE, i.e.
sub('.00', '', sched$duty_rate, fixed = TRUE)
[1] "5% /GP per tonne" "10% /GP per tonne"
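A short sketch of why fixed = TRUE matters here, and where a literal ".00" can still fall short. The input values below are hypothetical and chosen only to expose the edge cases; they do not come from the question.

```r
# Hypothetical inputs: a value with "100" before the dot, and one with three zeroes.
x <- c("100.00% /GP per tonne", "5.000% /GP per tonne")

# As a regex, "." matches any character, so the first match can be "100".
as_regex <- sub(".00", "", x)

# fixed = TRUE matches the literal ".00", but leaves a stray zero for "5.000%".
as_literal <- sub(".00", "", x, fixed = TRUE)

# Anchoring on the % sign removes the dot and all zeroes that precede it.
anchored <- sub("\\.0+(%)", "\\1", x)

print(as_regex)
print(as_literal)
print(anchored)
```

If the column only ever contains exactly two decimal zeroes, the fixed = TRUE call is sufficient; the anchored pattern is the safer choice when the number of zeroes may vary.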
Upvotes: 1