anu

Reputation: 295

Pattern Matching in R

I have a list like

> list(c("a","b","c","d"),c("b","c","e"))
[[1]]
[1] "a" "b" "c" "d"

[[2]]
[1] "b" "c" "e"

I have the sequence "bc". I want to match this pattern against my list and find out how often it occurs. Required output: 2.

First, I need to convert my list into the form c("abcd"), c("bce") so that I can do the matching. How do I convert and match? Second, how do I calculate and store the frequency?

I was using the grepl function, but it returns a logical value, not a count.

Upvotes: 0

Views: 775

Answers (2)

A5C1D2H2I1M1N2O1R2T1

Reputation: 193687

Using @Tyler's sample data, you can use gregexpr:

lst <- list(c('a', 'b', 'c', 'd', 'b', 'c'),
            c('b', 'c', 'e'))
lst2 <- lapply(lst, paste, collapse="")
sapply(gregexpr("bc", lst2, fixed = TRUE), length)
# [1] 2 1
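
One caveat worth sketching: gregexpr returns -1 for a string with no match, so taking length would report 1 rather than 0 for that element. The snippet below is a minimal sketch that counts only positive match positions and then sums the per-element counts to get the single total the question asks for. The helper name count_matches and the extra c("x", "y") element are assumptions added for illustration only.

```r
# Hypothetical helper: count non-overlapping fixed-string matches
# in each element of a list of character vectors.
count_matches <- function(pattern, lst) {
  # Collapse each vector into a single string, e.g. c("a","b") -> "ab"
  strings <- vapply(lst, paste, character(1), collapse = "")
  hits <- gregexpr(pattern, strings, fixed = TRUE)
  # gregexpr() yields -1 when nothing matches, so count positions > 0
  vapply(hits, function(h) sum(h > 0), integer(1))
}

# The question's list, plus one element with no match (illustrative)
lst <- list(c("a", "b", "c", "d"), c("b", "c", "e"), c("x", "y"))

counts <- count_matches("bc", lst)
counts       # 1 1 0
sum(counts)  # 2
```

Storing the result in counts keeps the per-element frequencies available, and sum(counts) gives the overall frequency.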

Upvotes: 2

Tyler Rinker

Reputation: 110062

Here's one approach using term.count (a non-exported function) from the qdap package:

lst <- list(c('a', 'b', 'c', 'd', 'b', 'c'), c('b', 'c', 'e'))
lst2 <- lapply(lst, paste, collapse = "")  # collapse each vector into one string

## install.packages("qdap")
sapply(lst2, qdap:::term.count, "bc")  # count occurrences

## > sapply(lst2, qdap:::term.count, "bc")
## bc bc 
##  2  1 

If you don't want to use qdap, look at the source for term.count and take what you need.

Upvotes: 1
