Reputation: 367
I have a data frame called source that looks something like this:
185 2002-07-04 NA NA 20
186 2002-07-05 NA NA 20
187 2002-07-06 NA NA 20
188 2002-07-07 14.400 0.243 20
189 2002-07-08 NA NA 20
190 2002-07-09 NA NA 20
191 2002-07-10 NA NA 20
192 2002-07-11 NA NA 20
193 2002-07-12 NA NA 20
194 2002-07-13 4.550 0.296 20
195 2002-07-14 NA NA 20
196 2002-07-15 NA NA 20
197 2002-07-16 NA NA 20
198 2002-07-17 NA NA 20
199 2002-07-18 NA NA 20
200 2002-07-19 NA 0.237 20
When I try
> nrow(complete.cases(source))
I only get NULL.
Can someone explain why this happens, and how I can count the rows that contain no NA or NaN values?
Upvotes: 2
Views: 410
Reputation: 20095
You can also keep only the rows with no missing values using rowSums and is.na:
source[rowSums(is.na(source))==0,]
# V1 V2 V3 V4 V5
# 4 188 2002-07-07 14.40 0.243 20
# 10 194 2002-07-13 4.55 0.296 20
nrow(source[rowSums(is.na(source))==0,])
#[1] 2
Upvotes: 0
Reputation: 50738
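Use sum instead. NROW is a safer counterpart to nrow because it handles both data.frames and vectors, but on the logical vector returned by complete.cases it gives the vector's length (the total number of rows), not the number of complete rows.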
sum(complete.cases(source))
#[1] 2
Alternatively, if you insist on using nrow, subset the data frame first:
nrow(source[complete.cases(source), ])
#[1] 2
Explanation: complete.cases returns a logical vector indicating which cases (in your case, rows) are complete. A plain vector has no dimensions, so nrow returns NULL for it.
Sample data:
source <- read.table(text =
"185 2002-07-04 NA NA 20
186 2002-07-05 NA NA 20
187 2002-07-06 NA NA 20
188 2002-07-07 14.400 0.243 20
189 2002-07-08 NA NA 20
190 2002-07-09 NA NA 20
191 2002-07-10 NA NA 20
192 2002-07-11 NA NA 20
193 2002-07-12 NA NA 20
194 2002-07-13 4.550 0.296 20
195 2002-07-14 NA NA 20
196 2002-07-15 NA NA 20
197 2002-07-16 NA NA 20
198 2002-07-17 NA NA 20
199 2002-07-18 NA NA 20
200 2002-07-19 NA 0.237 20")
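To make the explanation concrete, here is a minimal sketch on a small made-up data frame (df and its columns x and y are hypothetical, not the asker's data). It shows what each function returns when applied to the result of complete.cases:

```r
# Hypothetical three-row data frame; only the first row is complete
df <- data.frame(x = c(1, NA, 3), y = c("a", "b", NA))

cc <- complete.cases(df)

cc        # logical vector: TRUE FALSE FALSE
dim(cc)   # NULL, a plain vector has no dimensions
nrow(cc)  # NULL, because nrow(x) is dim(x)[1L]
NROW(cc)  # 3, the vector's length, not the count of complete rows
sum(cc)   # 1, the number of complete rows
```

Only sum answers the original question; NROW avoids the NULL but counts every row.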
Upvotes: 3
Reputation: 7734
complete.cases returns a logical vector that indicates which rows are complete. Because a vector has no dim attribute, nrow returns NULL for it; as others have suggested, use sum instead. sum converts TRUE and FALSE to 1 and 0 internally, so it counts the TRUE values in your vector.
sum(complete.cases(source))
# [1] 2
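The coercion can be seen in isolation with a short sketch; the vector flags below is a made-up example, not part of the question's data:

```r
# Hypothetical logical vector with three TRUE values
flags <- c(TRUE, FALSE, TRUE, TRUE)

as.integer(flags)  # 1 0 1 1, TRUE becomes 1 and FALSE becomes 0
sum(flags)         # 3, the number of TRUE values
mean(flags)        # 0.75, the share of TRUE values
```

The same coercion is what makes mean(complete.cases(source)) give the fraction of complete rows, should that be of interest.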
If you are more interested in the data.frame that is left after excluding all incomplete rows, you can use na.exclude. It returns a data.frame, so nrow works on it.
nrow(na.exclude(source))
# [1] 2
na.exclude(source)
# V2 V3 V4 V5
# 188 2002-07-07 14.40 0.243 20
# 194 2002-07-13 4.55 0.296 20
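Since the question also mentions NaN, the following sketch checks that the approaches discussed here agree on a small made-up data frame containing both NA and NaN. The names df, a, b and the n_* variables are assumptions for illustration only:

```r
# Hypothetical data: row 1 is complete, rows 2 and 3 contain NA, row 4 contains NaN
df <- data.frame(a = c(1, NA, 3, NaN), b = c(10, 20, NA, 40))

n_sum  <- sum(complete.cases(df))        # count TRUE values directly
n_rows <- sum(rowSums(is.na(df)) == 0)   # rows with zero missing entries
n_excl <- nrow(na.exclude(df))           # rows left after dropping incomplete ones

c(n_sum, n_rows, n_excl)  # 1 1 1
```

All three treat NaN as missing here, because is.na(NaN) is TRUE in R.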
Upvotes: 0