Reputation: 21
I am attempting to re-name some character strings given to me in a large list. The issue is that I only need to replace some of the characters not all of them.
exdata <- c("i_am_having_trouble_with_this_string",
"i_am_wishing_files_were_cleaner_for_me",
"any_help_would_be_greatly_appreciated")
From this list, for example, I would like to replace the third through the fifth instance of "_" with "-". I am having trouble understanding the regex coding for this, as most examples split strings up instead of keeping them intact.
Upvotes: 1
Views: 3494
Reputation: 269481
Here are some alternative approaches. All of them can be generalized to arbitrary bounds by replacing 3 and 5 with other numbers.
1) strsplit Split the strings at underscore and use paste
to collapse it back using the appropriate separators. No packages are used.
i <- 3
j <- 5
sapply(strsplit(exdata, "_"), function(x) {
g <- seq_along(x)
g[g < i] <- i
g[g > j + 1] <- j+1
paste(tapply(x, g, paste, collapse = "_"), collapse = "-")
})
giving:
[1] "i_am_having-trouble-with-this_string"
[2] "i_am_wishing-files-were-cleaner_for_me"
[3] "any_help_would-be-greatly-appreciated"
2) for loop This translates the first j occurrences of old
to new
in x
and then translates the first i-1 occurrences of new
back to old
. No packages are used.
translate <- function(old, new, x, i = 1, j) {
if (i <= 1) {
if (j > 0) for(k in seq_len(j)) x <- sub(old, new, x, fixed = TRUE)
x
} else Recall(new, old, Recall(old, new, x, 1, j), 1, i-1)
}
translate("_", "-", exdata, 3, 5)
giving:
[1] "i_am_having-trouble-with-this_string"
[2] "i_am_wishing-files-were-cleaner_for_me"
[3] "any_help_would-be-greatly-appreciated"
3) gsubfn This uses a package but in return is substantially shorter than the others. gsubfn
is like gsub
except that the replacement string in gsub
can be a string, list, function or proto object. In the case of a proto object the fun
method of the proto object is invoked each time there is a match to the regular expression. Below the matching string is passed to fun
as x
while the output of fun
replaces the match in the data. The proto object is automatically populated with a number of variables set by gsubfn
and accessible by fun
including count
which is 1 for the first match, 2 for the second and so on. For more information see the gsubfn vignette -- section 4 discusses the use of proto objects.
library(gsubfn)
p <- proto(i = 3, j = 5,
fun = function(this, x) if (count >= i && count <= j) "-" else x)
gsubfn("_", p, exdata)
giving:
[1] "i_am_having-trouble-with-this_string"
[2] "i_am_wishing-files-were-cleaner_for_me"
[3] "any_help_would-be-greatly-appreciated"
Upvotes: 4
Reputation: 447
> gsub('(.*_.*_.*?)_(.*?)_(.*?)_(.*)','\\1-\\2-\\3-\\4', exdata)
[1] "i_am_having-trouble-with-this_string" "i_am_wishing-files-were-cleaner_for_me" "any_help_would-be-greatly-appreciated"
Upvotes: 2