Reputation: 65
I'm totally new to the community and hope my question and example meet the criteria.
I've got a dataframe with two character vectors. The values in vector a vary in length, the values in vector b all consist of exactly one character.
a <- as.character(c("tsm", "skr", "fl", "pfl", "ts", "St", "S"))
b <- as.character(c("m", "k", "l", "l", "s", "t", "S"))
uedf <- data.frame(a, b, stringsAsFactors = FALSE)
I want to extract the character in a directly to the left of a character that is specified in vector b. The position of that character within the string can vary. So, from the first string, I want to extract "s" (left of m), in the second again "s" (left of k) and so on.
As I couldn't figure out how to do this using grepl() (I'm not very familiar with regex), I finally ended up with a combination of strsplit() and str_sub().
str_sub(strsplit(uedf$a,split=uedf$b, fixed=FALSE), start = -1, end = -1)
This works for most cases, but for the second string it returns ")" instead of the desired "s".
[1] "s" ")" "f" "f" "t" "S" ""
Any ideas why this might be and how I could solve the problem? Thanks in advance!
Upvotes: 2
Views: 903
Reputation: 992
Here I locate the position in each string of a where the corresponding character of b first matches, subtract one, and save the result in i. Then I extract the character at position i.
i <- mapply(regexpr, b, a) - 1
substr(a, i, i)
[1] "s" "s" "f" "f" "t" "S" ""
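One caveat with this approach: regexpr treats each value of b as a regular expression, so a character such as "." would match anything. Below is a minimal sketch of a wrapper that matches literally instead. The name char_left_of is hypothetical (it is not from the answer above), and only base R is used.

```r
# Hypothetical helper (the name is an assumption, not part of the answer above).
# For each string in x, return the character immediately left of the first
# literal occurrence of the corresponding element of target.
char_left_of <- function(x, target) {
  # fixed = TRUE makes regexpr match target literally rather than as a regex
  pos <- mapply(regexpr, target, x,
                MoreArgs = list(fixed = TRUE), USE.NAMES = FALSE)
  # pos - 1 is 0 when the match is the first character and negative when
  # there is no match; substr() returns "" in both cases
  substr(x, pos - 1, pos - 1)
}

a <- c("tsm", "skr", "fl", "pfl", "ts", "St", "S")
b <- c("m", "k", "l", "l", "s", "t", "S")

char_left_of(a, b)
# [1] "s" "s" "f" "f" "t" "S" ""

char_left_of("a.b", ".")
# [1] "a"
```

Without fixed = TRUE, the last call would find "." at position 1 (a regex dot matches any character) and return an empty string.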
Upvotes: 2
Reputation: 680
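I think str_sub only works with character vectors, but strsplit returns a list, and for the second string the list element is a vector of 2 strings ("s" and "r"). When that list is coerced to character, the second element becomes the text c("s", "r"), whose last character is ")".
This would do the job as long as the separator appears only once in every string (it takes the last character of the first piece):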
sapply(strsplit(a,split=b, fixed=FALSE), function(l) str_sub(l[[1]],-1,-1))
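To see where the ")" comes from, it helps to look at what strsplit returns before any substring is taken. The sketch below uses only base R. It assumes that str_sub coerces its list input much like as.character does, which is consistent with the output in the question but is an assumption about stringr's internals.

```r
a <- c("tsm", "skr", "fl", "pfl", "ts", "St", "S")
b <- c("m", "k", "l", "l", "s", "t", "S")

# strsplit returns a list with one character vector per input string
pieces <- strsplit(a, split = b)
lengths(pieces)
# [1] 1 2 1 1 1 1 1   (only "skr" is split into two pieces)

# Coercing the list to character deparses any element longer than one
flat <- as.character(pieces)
flat[2]
# [1] "c(\"s\", \"r\")"

# Taking the last character of each coerced string reproduces the question's output
last_char <- substring(flat, nchar(flat), nchar(flat))
last_char
# [1] "s" ")" "f" "f" "t" "S" ""
```

Picking the first piece of each list element before taking the last character, as in the sapply call above, avoids the coercion.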
Upvotes: 2
Reputation: 50668
Here is a solution using base R's gsub:
sapply(seq_along(a), function(i) ifelse(
  nchar(a[i]) > 1,
  gsub(paste0("^.*(\\w)", b[i], ".*$"), "\\1", a[i]),
  ""))
#[1] "s" "s" "f" "f" "t" "S" ""
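To make the pattern less opaque, here is a rough sketch of what happens for a single pair of values, and of why the nchar guard is there. The variable names (pattern, one, unguarded) are just for illustration.

```r
# Build the pattern for the pair a = "skr", b = "k"
pattern <- paste0("^.*(\\w)", "k", ".*$")
pattern
# [1] "^.*(\\w)k.*$"

# The capture group (\\w) holds the word character directly before "k";
# replacing the whole string with "\\1" leaves only that character
one <- gsub(pattern, "\\1", "skr")
one
# [1] "s"

# If nothing precedes the target, the pattern does not match and gsub
# returns the input unchanged, hence the nchar(a) > 1 check
unguarded <- gsub(paste0("^.*(\\w)", "S", ".*$"), "\\1", "S")
unguarded
# [1] "S"
```

Note that the guard only covers one-character strings. A longer string whose target is its first character (say "mts" with "m") would also come back unchanged, so an extra check may be needed if that can occur in your data.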
Or, more concisely, using mapply (thanks to @thelatemail):
mapply(function(a, b) ifelse(
  nchar(a) > 1,
  gsub(paste0("^.*(\\w)", b, ".*$"), "\\1", a),
  ""), a, b)
Upvotes: 2