Reputation: 3656
I have a vector, and I want to check each of its elements against each row of a data frame. The check uses grep, since the elements to be found are buried in other text.
With the help of this forum, I got this code:
mat=data.frame(par=c('long A story','C story', 'blabla D'),val=1:3)
vec=c('Z','D','A')
mat$label <- NA
for (x in vec){
  is.match <- lapply(mat$par,function(y) grep(x, y))
  mat$label[which(is.match > 0)] <- x
}
The problem is that it takes minutes to execute. Is there a way to vectorize this?
Upvotes: 2
Views: 940
Reputation: 15395
I've assumed you only want the first match in each row. Since the elements of vec are single characters, they can be combined into one character class:
which.matches <- grep("[ZDA]", mat$par)
what.matches <- regmatches(mat$par, regexpr("[ZDA]", mat$par))
mat$label[which.matches] <- what.matches
mat
           par val label
1 long A story   1     A
2      C story   2  <NA>
3     blabla D   3     D
EDIT: Benchmarking
Unit: microseconds
           expr     min       lq  median       uq      max
1   answer(mat) 185.338 194.0925 199.073 209.1850  898.919
2 question(mat) 672.227 693.9610 708.601 725.6555 1457.046
EDIT 2:
As @mrdwab suggested, this can actually be used as a one-liner:
mat$label[grep("[ZDA]", mat$par)] <- regmatches(mat$par, regexpr("[ZDA]", mat$par))
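The character class "[ZDA]" above is typed by hand. If vec is longer or holds multi-character strings, the pattern could probably be built from vec instead. This is only a sketch: the names pattern, m and hit are mine, it assumes the elements of vec contain no regex metacharacters, and it passes stringsAsFactors = FALSE so that mat$par is a character vector. Note that it keeps the leftmost match in each row, which is not necessarily the same as the loop in the question (there, the last matching element of vec wins).

```r
mat <- data.frame(par = c("long A story", "C story", "blabla D"),
                  val = 1:3, stringsAsFactors = FALSE)
vec <- c("Z", "D", "A")

# Build a single alternation pattern from vec, here "Z|D|A".
# Assumption: no element of vec contains regex metacharacters.
pattern <- paste(vec, collapse = "|")

# regexpr() returns the position of the first match per row, or -1 if none.
m <- regexpr(pattern, mat$par)
hit <- as.vector(m != -1)

# regmatches() returns the matched text for the matching rows only,
# so it lines up with the rows selected by hit.
mat$label <- NA_character_
mat$label[hit] <- regmatches(mat$par, m)
mat
```

Because grep/regexpr are called once on the whole column rather than once per element of vec and per row, this should scale much better than the nested loop.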
Upvotes: 3