Reputation: 353
I want to count the number of zeros in each column of an R data frame and express it as a percentage. This percentage should be added as the last row of the original data frame. Example:
x <- c(0, 4, 6, 0, 10)
y <- c(3, 0, 9, 12, 15)
z <- c(3, 6, 9, 0, 15)
data_a <- cbind(x,y,z)
I want to count the zeros in each column and express the count as a percentage of that column's entries.
Thanks
Upvotes: 6
Views: 33063
Reputation: 34
This is probably inelegant, but this is how I went about it when my columns had NAs:
# Number of zeros in each column (NAs ignored)
numZero <- colSums(vars == 0, na.rm = TRUE)
# Number of NA entries in each column
numNA <- colSums(is.na(vars))
# Total sample size (number of rows), repeated for each column
numSamp <- rep(nrow(vars), ncol(vars))
# Combine the three
varCheck <- as.data.frame(cbind(numZero, numNA, numSamp))
# Number of non-NA observations for each variable
varCheck$numTotal <- varCheck$numSamp - varCheck$numNA
# Proportion of zeros among non-NA observations (multiply by 100 for a percentage)
varCheck$pctZero <- varCheck$numZero / varCheck$numTotal
# Show the variables where more than 99% of the non-NA values are zero
varCheck[which(varCheck$pctZero > 0.99), ]
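The same NA-aware proportion can probably be had in one step. A minimal sketch, assuming a made-up data frame `vars` (the answer above does not show its data): `colMeans()` on the logical matrix `vars == 0` with `na.rm = TRUE` divides the zero count by the number of non-NA entries.

```r
# Hypothetical data; the original answer does not show `vars`
vars <- data.frame(a = c(0, NA, 0, 2),
                   b = c(1, 0, NA, NA))

# Share of zeros among the non-NA entries of each column, as a percentage
pctZero <- colMeans(vars == 0, na.rm = TRUE) * 100
pctZero
```

Here column `a` has two zeros out of three non-NA values and column `b` has one zero out of two.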
Upvotes: 0
Reputation: 2076
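Here is one more method using lapply. Note that this works on a data frame, not on the matrix from the question: lapply iterates over the columns of a data frame but over the individual elements of a matrix. The result is a proportion between 0 and 1; multiply by 100 for a percentage.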
lapply(data_a, function(x){ length(which(x==0))/length(x)})
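Applied to the question's data, the matrix has to be converted first. A minimal sketch, with `sapply` swapped in for `lapply` so the result is a named vector rather than a list; `data_df` and `pct_zero` are names chosen for this example.

```r
x <- c(0, 4, 6, 0, 10)
y <- c(3, 0, 9, 12, 15)
z <- c(3, 6, 9, 0, 15)

# Convert the matrix from the question into a data frame
data_df <- as.data.frame(cbind(x, y, z))

# Percentage of zeros per column, returned as a named numeric vector
pct_zero <- sapply(data_df, function(col) sum(col == 0) / length(col) * 100)
pct_zero
#  x  y  z 
# 40 20 20 
```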
Upvotes: 8
Reputation: 61154
A combination of prop.table and some *apply work can give you the same answer as @Roland's:
> prop <- apply(data_a, 2, function(x) prop.table(table(x))*100)
> rbind(data_a, sapply(prop, "[", 1))
      x  y  z
[1,]  0  3  3
[2,]  4  0  6
[3,]  6  9  9
[4,]  0 12  0
[5,] 10 15 15
[6,] 40 20 20
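One caveat: taking the first entry of each table gives the share of zeros only when 0 is the smallest value present in every column, as it is here. A sketch of a variant that does not rely on that, swapping `prop.table(table(x))` for `mean(col == 0)`; the extra column `w` and the name `data_b` are made up for illustration.

```r
data_a <- cbind(x = c(0, 4, 6, 0, 10),
                y = c(3, 0, 9, 12, 15),
                z = c(3, 6, 9, 0, 15))

# Hypothetical extra column whose smallest value is not zero
w <- c(-1, 2, 0, 5, 7)
data_b <- cbind(data_a, w)

# mean() of a logical vector is the proportion of TRUE values
pct_zero <- apply(data_b, 2, function(col) mean(col == 0) * 100)
rbind(data_b, pct_zero)
```

For `w`, the first table entry would be the share of -1 values, whereas `mean(col == 0)` still counts only the zeros.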
Upvotes: 2
Reputation: 132706
x <- c(0, 4, 6, 0, 10)
y <- c(3, 0, 9, 12, 15)
z <- c(3, 6, 9, 0, 15)
data_a <- cbind(x,y,z)
#This is a matrix not a data.frame.
res <- colSums(data_a==0)/nrow(data_a)*100
If you must, rbind to the matrix (usually not really a good idea).
rbind(data_a, res)
#      x  y  z
#      0  3  3
#      4  0  6
#      6  9  9
#      0 12  0
#     10 15 15
# res 40 20 20
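If the data really is a data frame, as the question describes, the same calculation carries over. A minimal sketch, where `data_df`, `pct_zero`, and `with_pct` are names chosen for this example. Keeping the percentages in their own vector is probably the safer default, since an appended summary row is picked up by any later column-wise calculation.

```r
data_df <- data.frame(x = c(0, 4, 6, 0, 10),
                      y = c(3, 0, 9, 12, 15),
                      z = c(3, 6, 9, 0, 15))

# data_df == 0 is a logical matrix; colMeans() gives the share of TRUE per column
pct_zero <- colMeans(data_df == 0) * 100

# Append as a last row only if that layout is really required
with_pct <- rbind(data_df, pct_zero)
with_pct
```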
Upvotes: 14