Reputation: 3195
Suppose this is my dataset
(dput)
dataset<-structure(list(group1 = structure(c(2L, 2L, 2L, 2L, 2L, 1L, 1L,
1L, 1L), .Label = c("b", "x"), class = "factor"), group2 = structure(c(2L,
2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L), .Label = c("g", "y"), class = "factor"),
var1 = c(2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L)), .Names = c("group1",
"group2", "var1"), class = "data.frame", row.names = c(NA, -9L
))
I need calculate frequency for two groups
x+y
b+g
and for variable var1, calculate count of 1 value and 2 value. For each group. So the desired output
total_count_of_group var1-1 var1-2
x y 5
3 2
b g 4 2 2
This output mean that total_count_of_group x+y=5 obs. by this group. where 1 value meets 3 times and 2 value two times.
similary total_count_of_group b+g=4 obs. by this group. where 1 value meets 2 times and 2 value two times.
How to get such table?
Upvotes: 3
Views: 2748
Reputation: 47300
Here's a tidyverse
solution :
library(tidyverse)
dataset %>%
group_by(group1, group2) %>%
summarize(total = n(), x = list(table(var1) %>% as_tibble %>% spread(var1,n))) %>%
unnest
# # A tibble: 2 x 5
# # Groups: group1 [2]
# group1 group2 total `1` `2`
# <fct> <fct> <int> <int> <int>
# 1 b g 4 2 2
# 2 x y 5 3 2
Upvotes: 1
Reputation: 886938
Here is an option using base R
out <- aggregate(cbind(var = rep(1, nrow(df1))) ~ .,
transform(df1, counts = ave(var1, group1, group2, FUN = length)), length)
reshape(out, idvar = c('group1', 'group2', 'counts'),
timevar= 'var1', direction= 'wide')
# group1 group2 counts var.1 var.2
#1 b g 4 2 2
#3 x y 5 3 2
Upvotes: 1
Reputation: 42544
This can be solved in two steps:
dataset
using data.table
:
library(data.table)
dcast(setDT(dataset)[, total_count_of_group := .N, by =. (group1, group2)],
group1 + group2 + total_count_of_group~ paste0("var1=", var1), length)
group1 group2 total_count_of_group var1_1 var1_2 1: b g 4 2 2 2: x y 5 3 2
Please, note that this will work for an arbitrary number of different values in var1
as well as an arbitrary number of groups.
Upvotes: 4
Reputation: 16121
library(tidyverse)
dataset %>%
group_by(group1, group2) %>% # for each combination of groups
mutate(counts = n()) %>% # count number of rows
count(group1, group2, var1, counts) %>% # count unique combinations
spread(var1, n, sep = "_") %>% # reshape dataset
ungroup() # forget the grouping
# # A tibble: 2 x 5
# group1 group2 counts var1_1 var1_2
# <fct> <fct> <int> <int> <int>
# 1 b g 4 2 2
# 2 x y 5 3 2
Upvotes: 3
Reputation: 4970
You could produce three tables, select the relevant counts, and then combine in a dataframe.
a <- table(dataset$group1, dataset$group2)
b <- table(dataset$var1[dataset$group1=='x'])
d <- table(dataset$var1[dataset$group1=='b'])
data.frame(total_count_of_group = c(a[2,2], a[1,1]),
var1_1 = c(b[1], b[2]),
var1_2 = c(d[1], d[2]))
total_count_of_group var1_1 var1_2
1 5 3 2
2 4 2 2
Upvotes: 1