Forest

Reputation: 721

Count the number of non-zero elements of each column

I'm very new to R, and I have a .rda file that contains a matrix of gene IDs and counts for each ID in 96 columns. It looks like this:

[image of the matrix, not available]

I want to get separate counts for the number of non-zero items in each column. I've been trying the sum() function in a loop, but perhaps I don't understand loop syntax in R. Any help appreciated. Thanks!


Upvotes: 46

Views: 126842

Answers (4)

Viktor Horváth

Reputation: 139

This dplyr approach counts the zeros in each row, that is, how many columns hold a zero for that row.

First, switch the data frame to row-wise mode with rowwise(). Then select the columns with c_across(), which returns a vector that can be passed to any function that takes a vector. Finally, assign the result to a new column with mutate().

library(dplyr)

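# Example data: three columns of random integers between 0 and 10
df <- data.frame(a = sample(0:10, 100, replace = TRUE),
                 b = sample(0:10, 100, replace = TRUE),
                 c = sample(0:10, 100, replace = TRUE))

# For each row, count how many of its columns equal zero
df %>%
  rowwise() %>%
  mutate(N_zeros = sum(c_across(everything()) == 0))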

This idea can also be modified for any other operation that would take all or a subset of columns for row-wise operation.

See documentation of c_across() for more details. Tested with dplyr version 1.0.6.
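The snippet above counts zeros per row. For the per-column non-zero counts the question asks for, a minimal sketch with across() (available in dplyr 1.0.0 and later) might look like this. The data frame and the name nonzero_counts are made up for illustration:

```r
library(dplyr)

# Small made-up data frame so the result is easy to check by eye
df <- data.frame(a = c(0, 1, 2, 0),
                 b = c(0, 0, 0, 5),
                 c = c(3, 4, 5, 6))

# One row of output: the number of non-zero values in each column
nonzero_counts <- df %>%
  summarise(across(everything(), ~ sum(.x != 0)))

nonzero_counts
```

With this data the counts should come out as 2, 1 and 4 for a, b and c.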

Upvotes: 1

Ayse Ozhan

Reputation: 81

With x being a column or vector:

length(which(x != 0))
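One detail worth knowing is that which() drops NA values, so this form quietly ignores missing entries. A small sketch, assuming made-up data and hypothetical variable names, that shows this and applies the same idea to every column of a data frame:

```r
x <- c(0, 3, NA, 7, 0)

# which() drops NA, so this counts only the known non-zero values
n_which <- length(which(x != 0))

# sum() on the logical vector gives the same count once NAs are removed
n_sum <- sum(x != 0, na.rm = TRUE)

# Applied to every column of a data frame (for a matrix, apply(m, 2, ...) works)
dat <- data.frame(a = c(0, 1, 2), b = c(0, 0, 4))
per_column <- vapply(dat, function(col) length(which(col != 0)), integer(1))
```

Without na.rm = TRUE, sum(x != 0) would return NA for this vector.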

Upvotes: 8

maloneypatr

Reputation: 3622

Another method using plyr's numcolwise:

library(plyr)

dat <- data.frame(a = sample(1:25, 25),
                  b = rep(0, 25),
                  c = sample(1:25, 25))
nonzero <- function(x) sum(x != 0)
numcolwise(nonzero)(dat)
   a b  c
1 25 0 25

Upvotes: 4

Jealie

Reputation: 6267

What about:

apply(your.matrix, 2, function(col) sum(col != 0))

Does this help?

edit:

Even better:

colSums(your.matrix != 0)

edit 2:

Here we go, with an example for ya:

> example <- matrix(sample(c(0, 0, 0, 100), size = 70, replace = TRUE), ncol = 7)
> example
      [,1] [,2] [,3] [,4] [,5] [,6] [,7]
 [1,]    0  100    0    0  100    0  100
 [2,]  100    0    0    0    0    0  100
 [3,]    0    0    0    0    0    0  100
 [4,]    0  100    0    0    0    0    0
 [5,]    0    0  100  100    0    0    0
 [6,]    0    0    0  100    0    0    0
 [7,]    0  100  100    0    0    0    0
 [8,]  100    0    0    0    0    0    0
 [9,]  100  100    0    0  100    0    0
[10,]    0    0    0    0    0  100    0
> colSums(example != 0)
[1] 3 4 2 2 2 1 3

(New example: the previous one used '1' values, which did not make it clear that we are counting the non-zero cells, not summing their contents.)
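Since the question mentions trying sum() in a loop, here is a rough sketch of what such a loop could look like next to the colSums() version. The matrix, its row and column names, and the variable names are all made up for illustration:

```r
# Made-up 4 x 3 count matrix with hypothetical gene IDs as row names
counts <- matrix(c(0, 5, 0, 2,
                   1, 0, 0, 0,
                   3, 3, 3, 0),
                 ncol = 3,
                 dimnames = list(paste0("gene", 1:4), c("s1", "s2", "s3")))

# Explicit loop: one non-zero count per column
loop_counts <- integer(ncol(counts))
for (j in seq_len(ncol(counts))) {
  loop_counts[j] <- sum(counts[, j] != 0)
}
names(loop_counts) <- colnames(counts)

# Vectorised equivalent: TRUE counts as 1 when summed
vec_counts <- colSums(counts != 0)
```

Both give 2, 1 and 3 for s1, s2 and s3. The colSums() form is shorter and avoids the index bookkeeping.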

Upvotes: 79
