ATMA

Reputation: 1468

Create a table from survey results

I have the following data and would like to generate a table of the frequency of each response, using base R, plyr, or another package.

My data:

df = data.frame(id = c(1,2,3,4,5),
                Did_you_use_tv=c("tv","","","tv","tv"),
                Did_you_use_internet=c("","","","int","int"))
df

I can get the frequencies for any single column, or a cross-tabulation of two columns, using table():

table(df[,2])
table(df[,2], df[,3])

However, how can I set up the data so that it looks like the following?

df2 = data.frame(Did_you_use_tv=c(3), 
                Did_you_use_internet=c(2))
df2

It's just a summary of frequencies for each column.

I will also be creating cross tabs, but given the structure of the data, I feel this summary may be a little more useful.

Upvotes: 0

Views: 332

Answers (4)

A5C1D2H2I1M1N2O1R2T1

Reputation: 193517

This is similar in concept to @Tyler's answer. Just take the sum of all values that are not equal to "":

colSums(df[-1] != "")
#       Did_you_use_tv Did_you_use_internet 
#                    3                    2 
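If a survey export stores non-responses as NA rather than "", the comparison above yields NA and the column sums become NA as well. Here is a minimal sketch of one way to guard against that. The data frame df_na is a hypothetical variant of the question's data, not something from the original post.

```r
# Hypothetical variant of the question's data: some non-responses stored as NA
df_na <- data.frame(id = 1:5,
                    Did_you_use_tv = c("tv", NA, "", "tv", "tv"),
                    Did_you_use_internet = c("", "", NA, "int", "int"))

# A cell counts as answered only if it is neither NA nor an empty string
answered <- !is.na(df_na[-1]) & df_na[-1] != ""

# Column-wise totals of answered cells
counts <- colSums(answered)
counts
```

Because FALSE & NA evaluates to FALSE, the logical matrix contains no missing values and colSums needs no na.rm argument.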

Update

Fellow Stack Overflow user @juba has done some work on a function called multi.table which looks like this:

multi.table <- function(df, true.codes=NULL, weights=NULL) {
  true.codes <- c(as.list(true.codes), TRUE, 1)
  as.table(sapply(df, function(v) {
    sel <- as.numeric(v %in% true.codes)
    if (!is.null(weights)) sel <- sel * weights
    sum(sel)
  }))
}

The function is part of the questionr package.

Usage in your example would be:

library(questionr)
multi.table(df[-1], true.codes=list("tv", "int"))
#       Did_you_use_tv Did_you_use_internet 
#                    3                    2 

Upvotes: 2

Maiasaura

Reputation: 32986

With reshape2 (melt and dcast):

library(reshape2)
t(dcast(subset(melt(df, id.vars = "id"), value != ""), variable ~ ., fun.aggregate = length))

Upvotes: 0

Jilber Urbina

Reputation: 61154

Here's another approach

> do.call(cbind, lapply(df[,-1], table))[-1, ]
      Did_you_use_tv Did_you_use_internet 
                   3                    2 
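One caveat, as far as I can tell: the [-1, ] step drops the first row by position, which works here because every column has a blank entry that sorts first. A small sketch of what happens otherwise, using a hypothetical column all_tv in which every respondent answered:

```r
# Hypothetical column where every respondent answered, so table() has no "" entry
all_tv <- c("tv", "tv", "tv")
tab <- table(all_tv)

# Dropping by position discards the only real count
by_position <- tab[-1]

# Dropping by name keeps "tv" whether or not a blank entry exists
by_name <- tab[names(tab) != ""]
by_name
```

Filtering on the name is a little more verbose, but it does not depend on the blank entry being present.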

Upvotes: 1

Tyler Rinker

Reputation: 109864

Here's one approach of many that came to mind:

FUN <- function(x) sum(x != "")
do.call(cbind, lapply(df[, -1], FUN))

##     Did_you_use_tv Did_you_use_internet
## [1,]              3                    2
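The result above is a matrix. If a one-row data frame like the df2 in the question is wanted, something along these lines should work. This is only a sketch, and swapping lapply for vapply is my own choice rather than part of the answer above.

```r
# Data from the question
df <- data.frame(id = c(1, 2, 3, 4, 5),
                 Did_you_use_tv = c("tv", "", "", "tv", "tv"),
                 Did_you_use_internet = c("", "", "", "int", "int"))

# vapply checks that each column yields exactly one integer and keeps the column names
counts <- vapply(df[-1], function(x) sum(x != ""), integer(1))

# as.list() turns the named vector into columns, giving a one-row data frame
df2 <- as.data.frame(as.list(counts))
df2
```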

Upvotes: 1
