Reputation: 225

Count number of elements meeting criteria in columns with NA values

I've got a matrix with "A", "B" and NA values, and I would like to count the number of "A" or "B" or NA values in every column.

sum(mydata[ , i] == "A")

and

sum(mydata[ , i] == "B")

worked fine for columns without NA. For columns that contain NA, I can count the number of NAs with sum(is.na(mydata[ , i])). In these columns, sum(mydata[ , i] == "A") returns NA instead of a number.

How can I count the number of "A" values in columns that contain NA values?

Thanks for your help!

Example:

> mydata
    V1  V2  V3  V4 
V2 "A" "A" "A" "A"
V3 "A" "A" "A" "A"
V4 "B" "B" NA  NA 
V5 "A" "A" "A" "A"
V6 "B" "A" "A" "A"
V7 "B" "A" "A" "A"
V8 "A" "A" "A" "A"

sum(mydata[ , 2] == "A")
# [1] 6

sum(mydata[ , 3] == "A")
# [1] NA

sum(is.na(mydata[ , 3]))
# [1] 1

Upvotes: 4

Answers (6)

Frank Odhiambo

Reputation: 1

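A quick way to do this is to compute summary statistics for the variable:

summary(mydata$my_variable) or table(mydata$my_variable, useNA = "ifany")

This will give you the number of missing values. (summary() reports NA counts for factor and numeric columns; table() leaves NA out unless useNA is set.)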

Hope this helps

Upvotes: 0

user4353689

Reputation: 1

Another possibility is to convert the column into a factor and then use the function summary. Example:

vec<-c("A","B","A",NA)

summary(as.factor(vec))

Upvotes: 0

InMktgWeTrust

Reputation: 31

Not sure if this is what you are after. I'm new to R too, so check that this works. The difference between the number of rows and the non-NA counts returned below tells you the number of NA items in each column.

colSums(!is.na(mydata))
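
The subtraction described above can be sketched as follows. This is a minimal example that assumes the matrix from the question (row and column names omitted); the variable names non_na and n_na are only illustrative.

```r
# Rebuild the example matrix from the question (dimnames omitted)
mydata <- matrix(c("A", "A", "A", "A",
                   "A", "A", "A", "A",
                   "B", "B", NA,  NA,
                   "A", "A", "A", "A",
                   "B", "A", "A", "A",
                   "B", "A", "A", "A",
                   "A", "A", "A", "A"),
                 ncol = 4, byrow = TRUE)

# Number of non-missing entries per column
non_na <- colSums(!is.na(mydata))
non_na
# [1] 7 7 6 6

# Rows minus non-missing entries gives the NA count per column
n_na <- nrow(mydata) - non_na
n_na
# [1] 0 0 1 1
```

colSums(is.na(mydata)) gives the same NA counts directly, without the subtraction.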

Upvotes: 3

BenBarnes

Reputation: 19454

To expand on the answer from @Andrie,

mydata <- matrix(c(rep("A", 8), rep("B", 2), rep(NA, 2), rep("A", 4),
  rep(c("B", "A", "A", "A"), 2), rep("A", 4)), ncol = 4, byrow = TRUE)

myFun <- function(x) {
  data.frame(n.A  = sum(x == "A", na.rm = TRUE),
             n.B  = sum(x == "B", na.rm = TRUE),
             n.NA = sum(is.na(x)))
}

apply(mydata, 2, myFun)
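
Because myFun returns a data frame, apply hands back a list with one single-row data frame per column. As a possible follow-up (a sketch, not part of the original answer), the pieces can be stacked into one table with do.call and rbind; the name counts is only illustrative.

```r
# Same data and helper as above, repeated so the snippet runs on its own
mydata <- matrix(c(rep("A", 8), rep("B", 2), rep(NA, 2), rep("A", 4),
  rep(c("B", "A", "A", "A"), 2), rep("A", 4)), ncol = 4, byrow = TRUE)

myFun <- function(x) {
  data.frame(n.A  = sum(x == "A", na.rm = TRUE),
             n.B  = sum(x == "B", na.rm = TRUE),
             n.NA = sum(is.na(x)))
}

# One single-row data frame per column of mydata
res <- apply(mydata, 2, myFun)

# Stack the rows: one row per column of mydata
counts <- do.call(rbind, res)
counts
#   n.A n.B n.NA
# 1   4   3    0
# 2   6   1    0
# 3   6   0    1
# 4   6   0    1
```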

Upvotes: 0

Sophia

Reputation: 1961

You can use table to count all your values at once.
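
A minimal sketch of that idea, assuming column V3 from the question is copied into a plain vector named vec (an illustrative name). By default table drops NA, so the useNA argument is needed to have it counted as well.

```r
# Column V3 from the question, as a plain character vector
vec <- c("A", "A", NA, "A", "A", "A", "A")

# Default behaviour: NA is left out of the tabulation
table(vec)

# With useNA = "ifany", NA gets its own cell (here: A = 6, NA = 1)
counts <- table(vec, useNA = "ifany")
counts

# Individual counts can be pulled out by name
counts[["A"]]
# [1] 6
```

For a whole matrix, apply(mydata, 2, table, useNA = "ifany") runs the same tabulation on every column.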

Upvotes: -1

Andrie

Reputation: 179558

The function sum (like many other math functions in R) takes an argument na.rm. If you set na.rm=TRUE, R removes all NA values before doing the calculation.

Try:

sum(mydata[,3]=="A", na.rm=TRUE)
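
To see why the comparison returns NA, and to count every column in one call, here is a minimal sketch. It rebuilds the matrix from the question (row and column names omitted) and uses colSums, which also accepts na.rm; the name count_A is only illustrative.

```r
# Rebuild the example matrix from the question (dimnames omitted)
mydata <- matrix(c("A", "A", "A", "A",
                   "A", "A", "A", "A",
                   "B", "B", NA,  NA,
                   "A", "A", "A", "A",
                   "B", "A", "A", "A",
                   "B", "A", "A", "A",
                   "A", "A", "A", "A"),
                 ncol = 4, byrow = TRUE)

# Comparing NA with anything yields NA, which propagates through sum()
NA == "A"
# [1] NA

# Dropping the NA comparisons gives the count for one column
sum(mydata[, 3] == "A", na.rm = TRUE)
# [1] 6

# colSums() takes na.rm as well, so all columns can be counted at once
count_A <- colSums(mydata == "A", na.rm = TRUE)
count_A
# [1] 4 6 6 6
```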

Upvotes: 7
