Jonathan Neve

Reputation: 25

Counting with zero values in the apply function

I am trying to count how often each value from a predefined list occurs in every column of a data frame, including values that occur zero times, using the apply function. I have managed to do the two parts separately, but would ideally like to combine them in a single line. Here is my aim:

list <- c("x", "y", "z")

df       
    V1   V2   V3
    x    y    y
    x    x    z
    y    z    z

Desired result

     V1    V2   V3
 x   2     1    0
 y   1     1    1
 z   0     1    2

So I managed to do this for an individual column:

out <- table(factor(df$V1,levels=list))

And for all columns, but without defining the list (so no zero occurrences):

occurences <- (apply(df,2,(table)))

So ideally I want one inside the other, such as:

occurences <- as.data.frame(apply(df,2,(table(factor(df,levels=list)))))

Sadly, R complains that table(factor(df,levels=list)) is not a function. Any help would be greatly appreciated.

Upvotes: 1

Views: 374

Answers (2)

Colonel Beauvel

Reputation: 31181

You are almost there. As the error says, apply expects a function as its third argument, so you just need to wrap your expression in an anonymous function:

apply(df, 2, function(u) table(factor(u, levels=vec)))
#  V1 V2 V3
#x  2  1  0
#y  1  1  1
#z  0  1  2
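
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the example data can be rebuilt as plain character columns (the data posted in this answer uses factors), and it adds the as.data.frame step from the question, since apply returns a matrix. The name occurrences is only illustrative.

```r
# Example data from the question, rebuilt as plain character columns (assumption)
df <- data.frame(
  V1 = c("x", "x", "y"),
  V2 = c("y", "x", "z"),
  V3 = c("y", "z", "z"),
  stringsAsFactors = FALSE
)
vec <- c("x", "y", "z")

# apply() hands each column to the anonymous function as a vector;
# fixing the factor levels makes table() report zero counts as well
counts <- apply(df, 2, function(u) table(factor(u, levels = vec)))

# apply() returns a matrix; convert it if a data frame is needed
occurrences <- as.data.frame(counts)
occurrences
```

The key point is that the third argument is a function of one column (u), not an expression that has already been evaluated on the whole data frame.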

You can also use lapply, which iterates over the columns of your data.frame:

do.call(rbind,lapply(df, function(u) table(factor(u, levels=vec))))
#   x y z
#V1 2 1 0
#V2 1 1 1
#V3 0 1 2
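
Note that the rbind result has one row per column of df, which is the transpose of the layout asked for in the question. A possible way to get the values as rows directly is sapply, sketched below on data rebuilt as character columns (an assumption; by_row and by_col are illustrative names).

```r
# Example data from the question, rebuilt as plain character columns (assumption)
df <- data.frame(
  V1 = c("x", "x", "y"),
  V2 = c("y", "x", "z"),
  V3 = c("y", "z", "z"),
  stringsAsFactors = FALSE
)
vec <- c("x", "y", "z")

# One row per column of df (same layout as the do.call(rbind, ...) result)
by_row <- do.call(rbind, lapply(df, function(u) table(factor(u, levels = vec))))

# sapply() binds the per-column counts as columns: values in rows, V1..V3 in columns
by_col <- sapply(df, function(u) table(factor(u, levels = vec)))
by_col
```

Transposing the first result with t(by_row) should give the same counts as by_col.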

Note that naming a vector "list" is misleading: list is also the name of a base R function, so I renamed your vector "vec".

Data:

vec = c("x", "y", "z")

df = structure(list(V1 = structure(c(1L, 1L, 2L), .Label = c("x", 
"y"), class = "factor"), V2 = structure(c(2L, 1L, 3L), .Label = c("x", 
"y", "z"), class = "factor"), V3 = structure(c(1L, 2L, 2L), .Label = c("y", 
"z"), class = "factor")), .Names = c("V1", "V2", "V3"), row.names = c(NA, 
-3L), class = "data.frame")

Upvotes: 1

EDi

Reputation: 13310

Here is my solution, using plyr's rbind.fill:

df <- read.table(header = TRUE, text = '   V1   V2   V3
x    y    y
x    x    z
y    z    z')

require(plyr)
out <- rbind.fill(lapply(df, function(x) as.data.frame.matrix(t(table(x)))))
out[is.na(out)] <- 0
out
#   x y z
# 1 2 1 0
# 2 1 1 1
# 3 0 1 2
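
One caveat worth checking with this kind of approach: it only knows about values that actually appear somewhere in the data, so a value from the predefined list that occurs in no column at all would not get a column of zeros. The base R sketch below illustrates the difference using a hypothetical extra value "w" that is not in the original question.

```r
# Example data from the question, rebuilt as plain character columns (assumption)
df <- data.frame(
  V1 = c("x", "x", "y"),
  V2 = c("y", "x", "z"),
  V3 = c("y", "z", "z"),
  stringsAsFactors = FALSE
)

# Hypothetical list with an extra value "w" that occurs in no column
vec <- c("w", "x", "y", "z")

# Values discovered from the data alone: "w" is never seen
seen <- sort(unique(unlist(df)))

# Fixing the factor levels keeps "w" as a row of zeros
counts <- sapply(df, function(u) table(factor(u, levels = vec)))
counts
```

If the list is guaranteed to contain only values present in the data, this distinction does not matter.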

Upvotes: 0
