Reputation: 1642
I would like to count the number of groups (persons, in the following example) that have no missing values in any of their rows.
unbal <- data.frame(PERSON=c(rep('Frank',5),rep('Tony',5),rep('Edward',5)), YEAR=c(2001,2002,2003,2004,2005,2001,2002,2003,2004,2005,2001,2002,2003,2004,2005), Y=c(21,22,23,24,25,5,6,NA,7,8,31,32,33,34,35), X=c(1:15))
unbal
PERSON YEAR Y X
1 Frank 2001 21 1
2 Frank 2002 22 2
3 Frank 2003 23 3
4 Frank 2004 24 4
5 Frank 2005 25 5
6 Tony 2001 5 6
7 Tony 2002 6 7
8 Tony 2003 NA 8
9 Tony 2004 7 9
10 Tony 2005 8 10
11 Edward 2001 31 11
12 Edward 2002 32 12
13 Edward 2003 33 13
14 Edward 2004 34 14
15 Edward 2005 35 15
In this case the result should be 2, since only two persons (Frank and Edward) have complete data.
Upvotes: 2
Views: 116
Reputation: 38500
We can use anyNA, which can operate on data.frames, together with by. Prepending the ! operator negates the results to return the desired values.
Using by,
!by(unbal, unbal["PERSON"], FUN=anyNA)
PERSON: Edward
[1] TRUE
----------------------------------------------------------------------------------
PERSON: Frank
[1] TRUE
----------------------------------------------------------------------------------
PERSON: Tony
[1] FALSE
Or, to return a named vector, wrap it in c.
!c(by(unbal, unbal["PERSON"], FUN=anyNA))
Edward Frank Tony
TRUE TRUE FALSE
To calculate the number of persons with no missing values, wrap this in sum.
sum(!c(by(unbal, unbal["PERSON"], FUN=anyNA)))
[1] 2
As a modification of sotos's method, we can use anyNA with split and sapply like this.
!sapply(split(unbal, unbal$PERSON), anyNA)
Edward Frank Tony
TRUE TRUE FALSE
Upvotes: 1
Reputation: 886938
We can use data.table
library(data.table)
setDT(unbal)[, .(ind = all(complete.cases(.SD))), PERSON]
If we need the names of the persons with complete data, just extract PERSON
setDT(unbal)[, .(ind = all(complete.cases(.SD))), PERSON][(ind), PERSON]
#[1] Frank Edward
and if we need the total number
setDT(unbal)[, .(ind = all(complete.cases(.SD))), PERSON][, sum(ind)]
#[1] 2
Upvotes: 2
Reputation: 51582
One way via base R,
sapply(split(unbal, unbal$PERSON), function(i) all(complete.cases(i)))
#Edward Frank Tony
# TRUE TRUE FALSE
You can do this to extract,
ind <- sapply(split(unbal, unbal$PERSON), function(i) all(complete.cases(i)))
names(ind)[ind]
#[1] "Edward" "Frank"
#or for the length
length(ind[ind])
#[1] 2
Upvotes: 4
Reputation: 171
You can try this:
length(unique(unbal$PERSON[!unbal$PERSON%in%unbal[!complete.cases(unbal),1]]))
# [1] 2
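To unpack the one-liner above, here is a step-by-step sketch of the same idea: find the rows with a missing value, collect the persons those rows belong to, and count everyone else. The intermediate names (incomplete_rows, persons_with_na, complete_persons) are my own and not part of the original answer, and I use setdiff instead of the negated %in% test, which should give the same result here.

```r
# Rebuild the example data from the question
unbal <- data.frame(
  PERSON = c(rep("Frank", 5), rep("Tony", 5), rep("Edward", 5)),
  YEAR   = rep(2001:2005, 3),
  Y      = c(21, 22, 23, 24, 25, 5, 6, NA, 7, 8, 31, 32, 33, 34, 35),
  X      = 1:15
)

# TRUE for every row that contains at least one NA
incomplete_rows <- !complete.cases(unbal)

# Persons who own at least one incomplete row
persons_with_na <- unique(unbal$PERSON[incomplete_rows])

# Everyone else has complete data
complete_persons <- setdiff(unique(unbal$PERSON), persons_with_na)

length(complete_persons)
```

Keeping complete_persons around also gives you the names, not just the count.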
Upvotes: 3
Reputation: 308
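I would do it like this:
# Counter for persons with no missing values
cp <- 0
for (i in unique(unbal$PERSON)) {
  # All rows belonging to the current person
  new_data <- unbal[which(unbal$PERSON == i), ]
  # Count the person only if none of their rows contains an NA
  if (!anyNA(new_data)) {
    cp <- cp + 1
  }
}
cp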
Upvotes: 1