Reputation: 225
I've got a matrix with "A", "B" and NA values, and I would like to count the number of "A", "B" or NA values in every column.
sum(mydata[ , i] == "A")
and
sum(mydata[ , i] == "B")
worked fine for columns without NA. For columns that contain NA, I can count the number of NAs with sum(is.na(mydata[ , i])). In these columns, sum(mydata[ , i] == "A") returns NA as a result instead of a number.
How can I count the number of "A" values in columns which contain NA values?
Thanks for your help!
Example:
> mydata
V1 V2 V3 V4
V2 "A" "A" "A" "A"
V3 "A" "A" "A" "A"
V4 "B" "B" NA NA
V5 "A" "A" "A" "A"
V6 "B" "A" "A" "A"
V7 "B" "A" "A" "A"
V8 "A" "A" "A" "A"
sum(mydata[ , 2] == "A")
# [1] 6
sum(mydata[ , 3] == "A")
# [1] NA
sum(is.na(mydata[ , 3]))
# [1] 1
Upvotes: 4
Views: 13301
Reputation: 1
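A quick way to do this is to tabulate the variable (this assumes a data frame column; for a matrix, use mydata[ , i] instead):
summary(as.factor(mydata$my_variable)) or table(mydata$my_variable, useNA = "ifany")
This will give you the count of each value, including the number of missing values. Note that table() leaves NA out unless useNA is set, and summary() only reports NA counts for a character column once it has been converted to a factor.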
Hope this helps
Upvotes: 0
Reputation: 1
Another possibility is to convert the column into a factor and then use the function summary. Example:
vec<-c("A","B","A",NA)
summary(as.factor(vec))
Upvotes: 0
Reputation: 31
Not sure if this is what you are after. I'm new to R too, so check that this works. The code below counts the non-NA entries in each column; the difference between the number of rows and that count tells you the number of NA items in the column.
colSums(!is.na(mydata))
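A minimal sketch of that subtraction, assuming the mydata matrix from the question (rebuilt here so the snippet runs on its own). The variable names non_na, na_by_diff and na_direct are only illustrative:

```r
# Rebuild the example matrix from the question (7 rows, 4 columns).
mydata <- matrix(c(rep("A", 8), rep("B", 2), rep(NA, 2), rep("A", 4),
                   rep(c("B", "A", "A", "A"), 2), rep("A", 4)),
                 ncol = 4, byrow = TRUE)

non_na     <- colSums(!is.na(mydata))   # non-missing entries per column
na_by_diff <- nrow(mydata) - non_na     # missing entries, via the difference
na_direct  <- colSums(is.na(mydata))    # missing entries, counted directly

na_by_diff
# [1] 0 0 1 1
```

Counting with colSums(is.na(mydata)) skips the subtraction and gives the same result.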
Upvotes: 3
Reputation: 19454
To expand on the answer from @Andrie,
mydata <- matrix(c(rep("A", 8), rep("B", 2), rep(NA, 2), rep("A", 4),
                   rep(c("B", "A", "A", "A"), 2), rep("A", 4)),
                 ncol = 4, byrow = TRUE)
myFun <- function(x) {
  data.frame(n.A = sum(x == "A", na.rm = TRUE),
             n.B = sum(x == "B", na.rm = TRUE),
             n.NA = sum(is.na(x)))
}
apply(mydata, 2, myFun)
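The apply() call above returns a list with one small data frame per column. If a single table of counts is more convenient, the following is one possible vectorised sketch using colSums(); it is an alternative to, not a restatement of, the function above, and the name counts is only illustrative:

```r
# Same example matrix as in the question.
mydata <- matrix(c(rep("A", 8), rep("B", 2), rep(NA, 2), rep("A", 4),
                   rep(c("B", "A", "A", "A"), 2), rep("A", 4)),
                 ncol = 4, byrow = TRUE)

# One row per value of interest, one column per matrix column.
counts <- rbind(
  n.A  = colSums(mydata == "A", na.rm = TRUE),
  n.B  = colSums(mydata == "B", na.rm = TRUE),
  n.NA = colSums(is.na(mydata))
)

counts["n.A", ]
# [1] 4 6 6 6
```

Each column of counts sums to nrow(mydata), which is a quick sanity check that no entry was missed.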
Upvotes: 0
Reputation: 179558
The function sum (like many other math functions in R) takes an argument na.rm. If you set na.rm=TRUE, R removes all NA values before doing the calculation.
Try:
sum(mydata[,3]=="A", na.rm=TRUE)
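To see why the plain sum() returns NA, here is a small sketch on a stand-alone vector x that mirrors column 3 of the example (x and cmp are illustrative names). The %in% line is an extra alternative, not part of the suggestion above:

```r
x <- c("A", "A", NA, "A", "A", "A", "A")  # same values as column 3

cmp <- x == "A"   # comparing NA with "A" yields NA, not FALSE

sum(cmp)                  # the unknown element makes the whole sum unknown
# [1] NA
sum(cmp, na.rm = TRUE)    # NA is dropped before summing
# [1] 6
sum(x %in% "A")           # %in% treats NA as a non-match, so no NA appears
# [1] 6
```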
Upvotes: 7