P.R

Reputation: 300

Count the number of non-NA elements in each row of a data frame before a condition becomes false

Say I have the following data set:

mydf <- data.frame(serial = 1:3,
                   A = c(NA, "2011-01-01", "2011-02-01"),
                   B = c("2010-12-01", "2011-01-01", "2011-02-01"),
                   C = c("2010-12-01", "2011-01-01", NA))

There is another vector called limit:

limit <- c("2011-02-10","2011-03-01","2011-01-12")

Think of limit as a threshold date for each row of mydf. I would like to count the number of non-zero/non-NA occasions in each row of mydf that fall BEFORE that row's threshold date. In this case, if I were to store the result in a vector called occasions, it would have the following elements: 2, 3, 2.

Note: The elements under each column are obviously dates in YYYY-mm-dd format.

Upvotes: 0

Views: 318

Answers (1)

IRTFM

Reputation: 263342

Apply colSums to the set of logical vectors created by "<":

occasions <- colSums(sapply(mydf[-1], as.Date, format = "%Y-%d-%m") <
                       as.Date(limit, format = "%Y-%d-%m"),
                     na.rm = TRUE)
occasions
#------
A B C 
2 3 2 

as.Date is needed so that the "<" comparison is made on dates rather than on strings, although a plain character comparison should also work if all the values are truly 'YYYY-MM-DD'.
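
Since the question asks for one count per row, a row-wise variant may be closer to what is wanted. colSums totals each column (hence the A B C labels above); in this 3 x 3 example the column and row totals happen to coincide. Below is a minimal sketch, assuming the data from the question. count_before is a hypothetical helper name, not part of the original answer, and the format string is passed in as an argument because the result depends on it:

```r
# Data from the question (dates stored as character strings).
mydf <- data.frame(serial = 1:3,
                   A = c(NA, "2011-01-01", "2011-02-01"),
                   B = c("2010-12-01", "2011-01-01", "2011-02-01"),
                   C = c("2010-12-01", "2011-01-01", NA))
limit <- c("2011-02-10", "2011-03-01", "2011-01-12")

# Hypothetical helper: per-row count of non-NA dates strictly before limit.
count_before <- function(df, limit, fmt) {
  # Convert every column to day counts; the result is a numeric matrix.
  dates <- sapply(df, function(x) as.numeric(as.Date(x, format = fmt)))
  lim <- as.numeric(as.Date(limit, format = fmt))
  # lim is recycled down each column, so row i is compared with lim[i].
  # NA comparisons are dropped by na.rm = TRUE.
  unname(rowSums(dates < lim, na.rm = TRUE))
}

# Same format string as in the answer's code.
occasions <- count_before(mydf[-1], limit, "%Y-%d-%m")

# Assumption: reading the same strings as year-month-day instead.
occasions_iso <- count_before(mydf[-1], limit, "%Y-%m-%d")

# Plain character comparison, with no date conversion at all.
occasions_chr <- unname(rowSums(as.matrix(mydf[-1]) < limit, na.rm = TRUE))

print(occasions)
print(occasions_iso)
print(occasions_chr)
```

With the "%Y-%d-%m" format this gives 2 3 2. Reading the same strings as "%Y-%m-%d" gives 2 3 0, because both dates in row 3 then fall after that row's limit. The character comparison agrees with the "%Y-%m-%d" result, which is what one would expect for fixed-width year-month-day strings.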

Upvotes: 1
