SnakesCantWearBoots

Reputation: 367

How to count rows in a logical vector

I have a data frame called source that looks something like this:

185  2002-07-04      NA      NA 20
186  2002-07-05      NA      NA 20
187  2002-07-06      NA      NA 20
188  2002-07-07  14.400   0.243 20
189  2002-07-08      NA      NA 20
190  2002-07-09      NA      NA 20
191  2002-07-10      NA      NA 20
192  2002-07-11      NA      NA 20
193  2002-07-12      NA      NA 20
194  2002-07-13   4.550   0.296 20
195  2002-07-14      NA      NA 20
196  2002-07-15      NA      NA 20
197  2002-07-16      NA      NA 20
198  2002-07-17      NA      NA 20
199  2002-07-18      NA      NA 20
200  2002-07-19      NA   0.237 20

and when I try

> nrow(complete.cases(source))

I only get NULL.

Can someone explain why this is the case, and how I can count the rows that contain no NA or NaN values?

Upvotes: 2

Views: 410

Answers (3)

MKR

Reputation: 20095

You can also use rowSums on is.na and keep the rows that have zero missing values:

source[rowSums(is.na(source))==0,]
#     V1         V2    V3    V4 V5
# 4  188 2002-07-07 14.40 0.243 20
# 10 194 2002-07-13  4.55 0.296 20

nrow(source[rowSums(is.na(source))==0,])
#[1] 2

Upvotes: 0

Maurits Evers

Reputation: 50738

Use sum instead. (NROW can handle both data.frames and vectors, but on a logical vector it returns the vector's length, not the number of TRUE values, so it does not count the complete rows.)

sum(complete.cases(source))
#[1] 2

Alternatively, if you insist on using nrow, subset the data frame first:

nrow(source[complete.cases(source), ])
#[1] 2

Explanation: complete.cases returns a logical vector indicating which cases (in your case rows) are complete. A plain vector has no dim attribute, so nrow returns NULL for it.
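
To make the explanation concrete, here is a minimal sketch on a small made-up data frame. The names df and cc are illustrative only and are not part of the question's data.

```r
# Hypothetical three-row data frame, used only for illustration
df <- data.frame(a = c(1, NA, 3), b = c("x", "y", NA))

cc <- complete.cases(df)  # logical vector, one element per row of df

dim(cc)    # NULL: a plain vector carries no dim attribute
nrow(cc)   # NULL: nrow() just reads dim(x)[1]
NROW(cc)   # 3: treats the vector as a one-column matrix, so this is its length
sum(cc)    # 1: the number of TRUE values, i.e. the complete rows
```

Only sum(cc) answers the question "how many rows are complete"; NROW(cc) always equals the number of rows in df.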


Sample data

source <- read.table(text = 
    "185  2002-07-04      NA      NA 20
186  2002-07-05      NA      NA 20
187  2002-07-06      NA      NA 20
188  2002-07-07  14.400   0.243 20
189  2002-07-08      NA      NA 20
190  2002-07-09      NA      NA 20
191  2002-07-10      NA      NA 20
192  2002-07-11      NA      NA 20
193  2002-07-12      NA      NA 20
194  2002-07-13   4.550   0.296 20
195  2002-07-14      NA      NA 20
196  2002-07-15      NA      NA 20
197  2002-07-16      NA      NA 20
198  2002-07-17      NA      NA 20
199  2002-07-18      NA      NA 20
200  2002-07-19      NA   0.237 20")

Upvotes: 3

kath

Reputation: 7734

complete.cases returns a logical vector that indicates which rows are complete. Because a vector has no dim attribute, nrow cannot be used on it; use sum instead, as others have suggested. sum coerces TRUE and FALSE to 1 and 0, so it counts the TRUE values of your vector.

sum(complete.cases(source))
# [1] 2
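
As a quick check of that coercion, and of the NaN part of the question, here is a small sketch. The objects flags and d are made up for illustration. Since is.na() is TRUE for NaN, complete.cases should treat NaN as missing as well.

```r
# Hypothetical logical vector: TRUE/FALSE become 1/0 when summed
flags <- c(TRUE, FALSE, TRUE)
as.integer(flags)   # 1 0 1
sum(flags)          # 2

# Made-up data frame containing both NaN and NA
d <- data.frame(x = c(1, NaN, 3), y = c(NA, 2, 3))
is.na(NaN)          # TRUE
complete.cases(d)   # FALSE FALSE TRUE

n_complete <- sum(complete.cases(d))  # 1: only the last row is complete
```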

If, however, you are more interested in the data.frame that remains after excluding all incomplete rows, you can use na.exclude. It returns a data.frame, so nrow works on the result.

nrow(na.exclude(source))
# [1] 2

na.exclude(source)
#             V2    V3    V4 V5
# 188 2002-07-07 14.40 0.243 20
# 194 2002-07-13  4.55 0.296 20

Upvotes: 0
