lg929

Reputation: 234

How do I count the number of pattern occurrences, if the pattern includes NA, in R?

I have a vector of 0s, 1s and NAs like so:

string<-c(0,1,1,0,1,1,NA,1,1,0,1,1,NA,1,0,
          0,1,0,1,1,1,NA,1,0,1,NA,1,NA,1,0,1,0,NA,1)

I'd like to count the number of times the pattern "1-NA-1" occurs. In this instance, I would like to get a count of 5.

I've tried table(string) and tried to replicate this, but nothing seems to work. I would appreciate anyone's help!

Upvotes: 1

Views: 114

Answers (2)

Rick

Reputation: 898

# some ugly code, but it seems to work
# compare three shifted copies of the vector: left neighbour, middle, right neighbour
sum(head(string, -2) == 1 &
    is.na(head(string[-1], -1)) &
    string[-1:-2] == 1, na.rm = TRUE)

Upvotes: 2

slamballais

Reputation: 3235

Something like:

x <- which(is.na(string))
x <- x[!x %in% c(1,length(string))]
length(x[string[x-1] & string[x+1]])
# [1] 5

-- REASONING --

First, we check which values of string are NA with is.na(string). Then we find those indices with which and store them in x.

As @Rick mentions, if the first/last value is NA it would lead to problems in our next step. So, we make sure that those are removed (as it shouldn't count anyway).
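To make the edge problem concrete, here is a small sketch on a made-up vector (v, idx and inner are illustrative names, not part of the question). When the first element is NA, the index x-1 contains a 0, which R silently drops, so the left and right neighbour vectors no longer line up:

```r
# Hypothetical vector whose first element is NA
v   <- c(NA, 1, NA, 1)
idx <- which(is.na(v))           # positions of NA: 1 and 3

left  <- v[idx - 1]              # indices 0 and 2; index 0 is dropped
right <- v[idx + 1]              # indices 2 and 4

length(left)                     # 1, shorter than idx
length(right)                    # 2

# Dropping NAs at the first/last position keeps the two sides aligned
inner <- idx[!idx %in% c(1, length(v))]
inner                            # 3
```

An NA in the last position has the mirror-image issue: x+1 points past the end of the vector and returns NA.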

Next, we want to find the situations where both string[x-1] and string[x+1] are 1. In other words, 1 & 1. Note that FALSE and TRUE can be evaluated as 0 and 1 respectively. So, if you type 1 == TRUE you will get TRUE. If you type 1 & 1 you will also get TRUE back. So, string[x-1] & string[x+1] will return TRUE when both are 1, and FALSE when either is 0. (This relies on the vector containing only 0, 1 and NA, with no two NAs next to each other; if a neighbour is itself NA, the result can be NA rather than FALSE.) We basically obtain a logical vector, and subset x with that vector to get all positions in x that satisfy our search. Then we use length to determine how many there are.
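If the data might contain two NAs in a row, or values other than 0 and 1, a slightly more defensive variant may be safer. The following is only a sketch: count_one_na_one is a hypothetical helper name, and it swaps the & trick for %in% 1, which returns FALSE (not NA) when a neighbour is NA:

```r
# Count positions where an NA sits directly between two 1s.
# Hypothetical helper, not from the original answers.
count_one_na_one <- function(v) {
  n <- length(v)
  if (n < 3) return(0L)          # no room for a three-element pattern
  left   <- v[1:(n - 2)]         # element before each candidate
  middle <- v[2:(n - 1)]         # candidate middle element
  right  <- v[3:n]               # element after each candidate
  # %in% 1 is FALSE for NA, so no na.rm is needed
  sum(is.na(middle) & left %in% 1 & right %in% 1)
}

string <- c(0,1,1,0,1,1,NA,1,1,0,1,1,NA,1,0,
            0,1,0,1,1,1,NA,1,0,1,NA,1,NA,1,0,1,0,NA,1)
count_one_na_one(string)
# [1] 5

count_one_na_one(c(1, NA, NA, 1))   # adjacent NAs are not counted
# [1] 0
```

Overlapping matches are counted separately, so c(1, NA, 1, NA, 1) gives 2, which matches how the question arrives at 5.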

Upvotes: 2
