Reputation: 504
I have a list of strings like so:
batch1, batch2, batch3, batch10, batch11
I am trying to add a 0 before the single digits, to get: batch01, batch02, batch03, batch10, batch11
I have found many similar questions and tried to write my own regex. I am very close, but I can't quite make it do what I want.
Batch <- gsub('(.{5})([0-9]{1}\\b)','\\10\\2', Batch)
outputs batch01, batch02, batch03, batch100, batch110
Using \\s instead of \\b doesn't change any values.
sampleNames$Batch <- gsub('(.{5})([0-9]{1})','\\10\\2', sampleNames$Batch)
outputs batch01, batch02, batch03, batch010, batch011
I've played around with a few other versions but I cannot seem to get it correct. I know this is a somewhat repetitive question, but I have not been able to alter previous solutions to do what I need to do.
Upvotes: 2
Views: 102
Reputation: 21908
You can also use the following solution:
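sapply(vec, function(x) {
  # Strip the letters, keeping only the number
  d <- gsub("([[:alpha:]]+)(\\d)", "\\2", x)
  if (nchar(d) == 1) {
    # Single digit: insert a 0 between the letters and the digit
    gsub("([[:alpha:]]+)(\\d)", "\\10\\2", x)
  } else {
    # Two or more digits: leave the name unchanged
    x
  }
})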
batch1 batch2 batch3 batch10 batch11
"batch01" "batch02" "batch03" "batch10" "batch11"
Upvotes: 1
Reputation: 18611
Use
sampleNames$Batch <- sub("(\\D|^)(\\d)$", "\\10\\2", sampleNames$Batch, perl=TRUE)
See regex proof.
EXPLANATION
--------------------------------------------------------------------------------
( group and capture to \1:
--------------------------------------------------------------------------------
\D non-digits (all but 0-9)
--------------------------------------------------------------------------------
| OR
--------------------------------------------------------------------------------
^ the beginning of the string
--------------------------------------------------------------------------------
) end of \1
--------------------------------------------------------------------------------
( group and capture to \2:
--------------------------------------------------------------------------------
\d digits (0-9)
--------------------------------------------------------------------------------
) end of \2
--------------------------------------------------------------------------------
$ before an optional \n, and the end of the
string
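To see how the two alternatives in the first group behave, here is a small sketch that applies the same pattern to a made-up vector. The vector `x` and its values "7" and "42" are assumptions added for illustration; they are not from the question.

```r
# Hypothetical input: two batch names plus two bare numbers
x <- c("batch1", "batch10", "7", "42")

# \\D handles "batch1" (a non-digit before the final digit);
# ^ handles "7" (the digit is the whole string).
# In the replacement, \\1 is followed by a literal 0, then \\2.
padded <- sub("(\\D|^)(\\d)$", "\\10\\2", x, perl = TRUE)
print(padded)
# [1] "batch01" "batch10" "07"      "42"
```

Strings that already end in two digits never match, because the character before the final digit is neither a non-digit nor the start of the string.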
Upvotes: 1
Reputation: 887028
We can capture the last digit and the lowercase letter before it as two groups, then in the replacement specify the backreferences of the groups with the 0 in between. Because the pattern requires a letter directly before the final digit, it won't match the strings that have two digits at the end.
sub("([a-z])(\\d)$", "\\10\\2", Batch)
[1] "batch01" "batch02" "batch03" "batch10" "batch11"
Or we may use sprintf/str_pad with str_replace:
library(stringr)
str_replace(Batch, "\\d+$", function(x) sprintf("%02d", as.numeric(x)))
[1] "batch01" "batch02" "batch03" "batch10" "batch11"
Data:
Batch <- c("batch1", "batch2", "batch3", "batch10", "batch11")
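If stringr is not available, the same sprintf idea can be sketched in base R with `regexpr` and `regmatches<-`. This is one possible variant, not the answer's own code; the name `padded` is introduced here for illustration, and the sketch assumes every string ends in a number.

```r
Batch <- c("batch1", "batch2", "batch3", "batch10", "batch11")

padded <- Batch
# Locate the trailing run of digits in each element
m <- regexpr("[0-9]+$", padded)
# Replace each match with its zero-padded, two-character form
regmatches(padded, m) <- sprintf("%02d", as.integer(regmatches(padded, m)))
print(padded)
# [1] "batch01" "batch02" "batch03" "batch10" "batch11"
```

Padding through `sprintf("%02d", ...)` leaves numbers of two or more digits as they are, so no separate check on the digit count is needed.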
Upvotes: 1