Stephen Clark

Reputation: 596

Calculate the levels of multiple variables and return tabular result

I would like to put the output from a summary command into a data table. For example, with this data frame:

   Person     V1     V2     V3     V4
1       A medium medium medium   high
2       B medium medium    low    low
3       V   high   high medium medium
4       D medium medium    low   high
5       E   high   high medium    low
6       F medium medium    low    low
7       G   high   high    low   high
8       H medium    low medium    low
9       I medium medium    low medium
10      J medium    low medium    low

x.df<-structure(list(Person = structure(c(1L, 2L, 10L, 3L, 4L, 5L, 
6L, 7L, 8L, 9L), .Label = c("A", "B", "D", "E", "F", "G", "H", 
"I", "J", "V"), class = "factor"), V1 = structure(c(2L, 2L, 1L, 
2L, 1L, 2L, 1L, 2L, 2L, 2L), .Label = c("high", "medium"), class = "factor"), 
V2 = structure(c(3L, 3L, 1L, 3L, 1L, 3L, 1L, 2L, 3L, 2L), .Label = c("high", 
"low", "medium"), class = "factor"), V3 = structure(c(2L, 
1L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 2L), .Label = c("low", "medium"
), class = "factor"), V4 = structure(c(1L, 2L, 3L, 1L, 2L, 
2L, 1L, 2L, 3L, 2L), .Label = c("high", "low", "medium"), class = "factor")), .Names = c("Person", 
"V1", "V2", "V3", "V4"), class = "data.frame", row.names = c(NA, 
-10L))

With summary(x.df) I get the counts for each factor level:

     Person       V1         V2         V3         V4   
 A      :1   high  :3   high  :3   low   :5   high  :3  
 B      :1   medium:7   low   :2   medium:5   low   :5  
 D      :1              medium:5              medium:2  
 E      :1                                              
 F      :1                                              
 G      :1                                              
 (Other):4                                              

Ideally, I would like a data frame of the counts for each factor level, i.e.:

  Var low medium high
1  V1   0      7    3
2  V2   2      5    3
3  V3   5      5    0
4  V4   5      2    3

with each row summing to 10.

Upvotes: 1

Views: 45

Answers (2)

Rui Barradas

Reputation: 76402

Here is a way using a helper function.
Note that the call to do.call comes from the second solution in the accepted answer to this question (the second link in @shreyasgm's comment on the question). I have just changed cbind to rbind.

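fun <- function(DF){
    # Union of the factor levels across every column except the first (Person)
    lv <- levels(unlist(DF[-1]))
    # Re-level each column so that every summary reports the same categories
    DF[-1] <- lapply(DF[-1], function(x) factor(x, levels = lv))
    # One row of counts per variable
    do.call(rbind, lapply(DF[-1], summary))
}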

fun(x.df)
#   high medium low
#V1    3      7   0
#V2    3      5   2
#V3    0      5   5
#V4    3      2   5
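
To illustrate why the columns are re-leveled before calling summary, here is a minimal sketch using two small made-up factors (v1 and v3 are hypothetical, not taken from x.df). Without a shared set of levels, each summary reports only the levels present in that factor, so the rows would not line up under rbind.

```r
# Hypothetical toy factors with different level sets
v1 <- factor(c("medium", "high", "medium"))
v3 <- factor(c("low", "medium", "low"))

# Each summary only knows about its own levels
s1 <- summary(v1)   # levels: high, medium
s3 <- summary(v3)   # levels: low, medium

# Shared level set, in order of first appearance
lv <- union(levels(v1), levels(v3))

# After re-leveling, both summaries have the same names, with zeros filled in
r1 <- summary(factor(v1, levels = lv))
r3 <- summary(factor(v3, levels = lv))
aligned <- rbind(v1 = r1, v3 = r3)
aligned
```

Here aligned has one column per level in lv, and a level that never occurs in a factor gets a count of 0.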

Upvotes: 1

lmo

Reputation: 38500

Here is a method of getting the counts for each variable into a matrix.

myMat <- sapply(x.df[-1],
                function(x) table(factor(x, levels=c("low", "medium", "high"))))

The idea is to use sapply to run through each of these variables, convert the variable to a factor with the desired levels, and then call table on the converted variable.

This returns

myMat
       V1 V2 V3 V4
low     0  2  5  5
medium  7  5  5  2
high    3  3  0  3

If you want to convert it to your desired output, just use t to transpose it:

t(myMat)
   low medium high
V1   0      7    3
V2   2      5    3
V3   5      5    0
V4   5      2    3
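
If a data frame with a Var column is wanted, as in the question, the transposed matrix can be wrapped in data.frame. The following is a sketch only: it rebuilds x.df compactly with the same values as the dput output in the question, and the names counts and res are just illustrative.

```r
# Compact rebuild of the example data (same values as x.df in the question)
x.df <- data.frame(
  Person = c("A", "B", "V", "D", "E", "F", "G", "H", "I", "J"),
  V1 = c("medium", "medium", "high", "medium", "high",
         "medium", "high", "medium", "medium", "medium"),
  V2 = c("medium", "medium", "high", "medium", "high",
         "medium", "high", "low", "medium", "low"),
  V3 = c("medium", "low", "medium", "low", "medium",
         "low", "low", "medium", "low", "medium"),
  V4 = c("high", "low", "medium", "high", "low",
         "low", "high", "low", "medium", "low"),
  stringsAsFactors = TRUE
)

# Count each level per variable, then transpose so variables are rows
counts <- t(sapply(x.df[-1],
                   function(x) table(factor(x, levels = c("low", "medium", "high")))))

# Move the row names into a Var column and use default row numbering
res <- data.frame(Var = rownames(counts), counts, row.names = NULL)
res
```

Each row of res should sum to 10 across the low, medium and high columns, since every person has exactly one value per variable.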

Upvotes: 2
