Reputation: 107
I am trying to summarize a categorical dataset that mixes two kinds of questions. The data below come from a questionnaire that had standard close-ended questions, but also questions where one could choose multiple answers from a list. "village" and "income" represent close-ended questions. "responsible.1" through "responsible.5" represent a list where the respondent said either yes or no to each item; an unselected item is recorded as NA.
VILLAGE INCOME responsible.1 responsible.2 responsible.3 responsible.4 responsible.5
j both DLNR NA DEQ NA Public
k regular.income DLNR NA NA NA NA
k regular.income DLNR CRM DEQ Mayor NA
l both DLNR NA NA Mayor NA
j both DLNR CRM NA Mayor NA
m regular.income DLNR NA NA NA Public
What I want is a 3-way table output with "village", "income", and the suite of "responsible" variables wrapped up into an ftable. This way, I could use the table with numerous R packages for graphs and analyses.
RESPONSIBLE
VILLAGE INCOME responsible.1 responsible.2 responsible.3 responsible.4 responsible.5
j both 2 1 1 1 1
k regular.income 2 1 1 1 0
l both 1 0 0 1 0
m regular.income 1 0 0 0 1
as.data.frame(table(village, responsible.1))
would get me the counts for the first "responsible" column, but I can't figure out how to get the entire thing wrapped up in a nice ftable.
Upvotes: 0
Views: 124
Reputation: 263301
> aggregate(dat[-(1:2)], dat[1:2], function(x) sum(!is.na(x)) )
VILLAGE INCOME responsible.1 responsible.2 responsible.3 responsible.4 responsible.5
1 j both 2 1 1 1 1
2 l both 1 0 0 1 0
3 k regular.income 2 1 1 1 0
4 m regular.income 1 0 0 0 1
I'm guessing you actually had another grouping vector, perhaps the first "responsible" column?
I don't really understand the sorting rules, but reversing the order of the grouping columns may be closer to what you posted:
> aggregate(dat[-(1:2)], dat[2:1], function(x) sum(!is.na(x)) )
INCOME VILLAGE responsible.1 responsible.2 responsible.3 responsible.4 responsible.5
1 both j 2 1 1 1 1
2 regular.income k 2 1 1 1 0
3 both l 1 0 0 1 0
4 regular.income m 1 0 0 0 1
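If an actual ftable object is needed rather than a data frame, one possible route is to stack the "responsible" columns into long form and cross-tabulate. The following is only a sketch using base R: the data frame is retyped from the question, and the names `resp_cols`, `long`, `RESPONSIBLE`, and `chosen` are made up for illustration.

```r
# Example data retyped from the question (NA = item not selected).
dat <- data.frame(
  VILLAGE = c("j", "k", "k", "l", "j", "m"),
  INCOME  = c("both", "regular.income", "regular.income",
              "both", "both", "regular.income"),
  responsible.1 = c("DLNR", "DLNR", "DLNR", "DLNR", "DLNR", "DLNR"),
  responsible.2 = c(NA, NA, "CRM", NA, "CRM", NA),
  responsible.3 = c("DEQ", NA, "DEQ", NA, NA, NA),
  responsible.4 = c(NA, NA, "Mayor", "Mayor", "Mayor", NA),
  responsible.5 = c("Public", NA, NA, NA, NA, "Public"),
  stringsAsFactors = FALSE
)

# Names of the multiple-choice columns.
resp_cols <- grep("^responsible\\.", names(dat), value = TRUE)

# Long form: one row per respondent per "responsible" item,
# with chosen = 1 if the item was selected and 0 if it was NA.
long <- data.frame(
  VILLAGE     = rep(dat$VILLAGE, times = length(resp_cols)),
  INCOME      = rep(dat$INCOME,  times = length(resp_cols)),
  RESPONSIBLE = rep(resp_cols, each = nrow(dat)),
  chosen      = as.integer(!is.na(unlist(dat[resp_cols], use.names = FALSE)))
)

# A numeric left-hand side makes xtabs() sum it within each cell,
# giving a 3-way table of selection counts.
tab <- xtabs(chosen ~ VILLAGE + INCOME + RESPONSIBLE, data = long)

# Flatten with VILLAGE and INCOME as row variables.
ft <- ftable(tab, row.vars = c("VILLAGE", "INCOME"))
print(ft)
```

Unlike the aggregate() output, the ftable also shows VILLAGE/INCOME combinations that never occur in the data (for example j with regular.income) as rows of zeros.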
Upvotes: 1