Reputation: 1232
I have a vector sliced from a data frame, and I want to count the number of times a certain sequence appears in it.
For example:
main <- c(A,B,C,A,B,V,A,B,C,D,E)
p <- c(A,B,C)
q <- c(A,B)
someFunction(main,p)
2
someFunction(main,q)
3
I've been messing around with rle, but it also counts every sub-repetition, which is undesirable.
Is there a quick solution I'm missing?
Upvotes: 1
Views: 476
Reputation: 20282
Here's a way to do it using embed(v,n), which returns a matrix of all n-length sub-sequences of vector v:
find_x_in_y <- function(x, y)
sum( apply( embed( y, length(x)), 1,
identical, rev(x)))
> find_x_in_y(p, main)
[1] 2
> find_x_in_y(q, main)
[1] 3
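To see why the function above compares each row against rev(x), it can help to look at what embed returns for a small input. The vector v below is a hypothetical example, not taken from the question; each row of the result is one window, with its elements in reverse order.

```r
# Hypothetical small vector, used only for illustration
v <- c("A", "B", "C", "D")

# All windows of length 3, one per row
windows <- embed(v, 3)
print(windows)
#      [,1] [,2] [,3]
# [1,] "C"  "B"  "A"
# [2,] "D"  "C"  "B"

# Row 1 is the window A, B, C stored back to front,
# so it matches rev(c("A", "B", "C")) rather than the pattern itself
first_matches <- identical(windows[1, ], rev(c("A", "B", "C")))
```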
Upvotes: 2
Reputation: 179448
Using sapply:
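find_x_in_y <- function(x, y){
  # The last window starts at length(y) - length(x) + 1
  n <- length(y) - length(x) + 1
  # No window fits when x is longer than y
  if (n < 1) return(0)
  sum(sapply(
    seq_len(n),
    function(i) all(y[i:(i + length(x) - 1)] == x)
  ))
}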
find_x_in_y(c("A", "B", "C"), main)
[1] 2
find_x_in_y(c("A", "B"), main)
[1] 3
Upvotes: 2
Reputation: 69201
You can use one of the regular expression tools in R since this is really a pattern matching exercise, specifically gregexpr for this question. The p and q vectors represent the search pattern and main is where we want to search for those patterns. From the help page for gregexpr:
gregexpr returns a list of the same length as text each element of which is of
the same form as the return value for regexpr, except that the starting positions
of every (disjoint) match are given.
So we can count the entries in the first element of the list returned by gregexpr, which holds the starting positions of the matches. We'll first collapse the vectors into single strings and then do the searching:
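someFunction <- function(haystack, needle) {
  # Collapse both vectors into single strings so they can be searched as text
  haystack <- paste(haystack, collapse = "")
  needle <- paste(needle, collapse = "")
  # fixed = TRUE treats the needle as a literal string, not a regular expression
  out <- gregexpr(needle, haystack, fixed = TRUE)
  # gregexpr returns -1 when nothing matches, so count only positive positions
  out.length <- sum(out[[1]] > 0)
  return(out.length)
}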
> someFunction(main, p)
[1] 2
> someFunction(main, q)
[1] 3
Note - you also need to put quotes around the elements of your main, p, and q vectors unless you have variables A, B, C, etc. defined:
main <- c("A","B","C","A","B","V","A","B","C","D","E")
p <- c("A","B","C")
q <- c("A","B")
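One consequence of the "disjoint" wording in the help page is that overlapping occurrences are counted only once by the collapsed-string approach. The sketch below uses hypothetical inputs (not from the question) chosen so that occurrences overlap, and compares the gregexpr count with a plain sliding-window count.

```r
# Hypothetical inputs chosen so that occurrences overlap
haystack <- c("A", "A", "A")
needle <- c("A", "A")

# Collapsed-string search: gregexpr reports disjoint matches only
hits <- gregexpr(paste(needle, collapse = ""),
                 paste(haystack, collapse = ""),
                 fixed = TRUE)[[1]]
disjoint_count <- sum(hits > 0)

# Sliding-window comparison: every start position is checked
starts <- seq_len(length(haystack) - length(needle) + 1)
window_count <- sum(vapply(
  starts,
  function(i) all(haystack[i:(i + length(needle) - 1)] == needle),
  logical(1)
))

disjoint_count  # 1
window_count    # 2
```

Which count is wanted depends on the data; for the vectors in the question the two agree because the patterns cannot overlap with themselves.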
Upvotes: 4
Reputation: 66852
I'm not sure if this is the best way, but you can simply do it with:
f <- function(a, b) {
  # Compare the head of b with a, then recurse on b without its first element
  if (length(a) > length(b)) 0
  else all(head(b, length(a)) == a) + Recall(a, tail(b, -1))
}
Someone may or may not find a built-in function.
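The recursion makes one nested call per element of b, so very long vectors may run into R's nesting limits. A loop that does the same head/tail comparison is sketched below; the name count_windows is hypothetical, and the guard for an empty pattern is an added assumption about the desired behaviour.

```r
# Hypothetical name; a is the pattern, b is the vector being searched
count_windows <- function(a, b) {
  # Assumption: an empty pattern counts as zero occurrences
  if (length(a) == 0) return(0)
  count <- 0
  while (length(b) >= length(a)) {
    # TRUE adds 1 when the current head of b equals the pattern
    count <- count + all(head(b, length(a)) == a)
    # Slide the window by dropping the first element
    b <- tail(b, -1)
  }
  count
}

main <- c("A","B","C","A","B","V","A","B","C","D","E")
count_windows(c("A","B","C"), main)  # 2
count_windows(c("A","B"), main)      # 3
```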
Upvotes: 3