stats_noob

Reputation: 5925

Counting Variable Factor Types in Terms of Another Variable's Factors

I am working in R. I have the following 3 data sets:

set.seed(123)

v1 <- c("2010-2011","2011-2012", "2012-2013", "2013-2014", "2014-2015") 
v2 <- c("A", "B", "C", "D", "E")
v3 <- c("Z", "Y", "X", "W" )

data_1 = data.frame(var_1 = rnorm(871, 10,10), var_2 = rnorm(871, 5,5))

data_1$dates <- as.factor(sample(v1, 871, replace=TRUE, prob=c(0.5, 0.2, 0.1, 0.1, 0.1)))

data_1$types <- as.factor(sample(v2, 871, replace=TRUE, prob=c(0.3, 0.2, 0.1, 0.1, 0.1)))

data_1$types2 <- as.factor(sample(v3, 871, replace=TRUE, prob=c(0.3, 0.5, 0.1, 0.1)))


data_2 = data.frame(var_1 = rnorm(412, 10,10), var_2 = rnorm(412, 5,5))

data_2$dates <- as.factor(sample(v1, 412, replace=TRUE, prob=c(0.5, 0.2, 0.1, 0.1, 0.1)))

data_2$types <- as.factor(sample(v2, 412, replace=TRUE, prob=c(0.3, 0.2, 0.1, 0.1, 0.1)))

data_2$types2 <- as.factor(sample(v3, 412, replace=TRUE, prob=c(0.3, 0.5, 0.1, 0.1)))

data_3 = data.frame(var_1 = rnorm(332, 10,10), var_2 = rnorm(332, 5,5))

data_3$dates <- as.factor(sample(v1, 332, replace=TRUE, prob=c(0.5, 0.2, 0.1, 0.1, 0.1)))

data_3$types <- as.factor(sample(v2, 332, replace=TRUE, prob=c(0.3, 0.2, 0.1, 0.1, 0.1)))

data_3$types2 <- as.factor(sample(v3, 332, replace=TRUE, prob=c(0.3, 0.5, 0.1, 0.1)))

Using the above data, I made a summary table:

summary_table = data.frame(names = c("data_1", "data_2", "data_3"),
                           counts = c(nrow(data_1), nrow(data_2), nrow(data_3)))

> summary_table
   names counts
1 data_1    871
2 data_2    412
3 data_3    332

To this table, I would like to add the counts for each level of "types", broken down by "types2".

I can do this manually for each individual set:

library(dplyr)

summary_1 = data.frame(data_1 %>% group_by(types, types2) %>% summarise(my_counts = n()))

summary_2 = data.frame(data_2 %>% group_by(types, types2) %>% summarise(my_counts = n()))

summary_3 = data.frame(data_3 %>% group_by(types, types2) %>% summarise(my_counts = n()))

#view sample

head(summary_1)

   types types2 my_counts
1      A      W        41
2      A      X        32
3      A      Y       176
4      A      Z        96
5      B      W        22
6      B      X        22

In the end, I would like to create something like this:

[Image: desired summary table]

Does anyone know how to do this (automatically for all levels)?

Thanks!

Upvotes: 0

Views: 35

Answers (1)

dcarlson

Reputation: 11076

First combine the data frames:

data_1 <- data.frame(name="data_1", data_1)
data_2 <- data.frame(name="data_2", data_2)
data_3 <- data.frame(name="data_3", data_3)
data <- rbind(data_1, data_2, data_3)
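
If there are many data sets, the tagging and stacking can probably be done in one pass over a named list instead of one line per data frame. This is only a sketch: the two small frames below are toy stand-ins for the real `data_1`, `data_2`, ..., and `sets` is a name I made up for illustration.

```r
# Toy stand-ins for the real data frames (hypothetical values)
data_1 <- data.frame(types = c("A", "B"), types2 = c("Z", "Y"))
data_2 <- data.frame(types = c("A", "A", "C"), types2 = c("W", "Z", "Z"))

# Named list of the data sets; the list names become the `name` column
sets <- list(data_1 = data_1, data_2 = data_2)

# Prepend each list name as a `name` column, then stack all frames
data <- do.call(rbind, Map(function(d, nm) data.frame(name = nm, d),
                           sets, names(sets)))
rownames(data) <- NULL  # drop the "data_1.1"-style row names left by rbind

table(data$name)  # rows contributed by each data set
```

With the real data, `sets` could be built as `list(data_1 = data_1, data_2 = data_2, data_3 = data_3)`; the rest stays the same.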

Then create a 3D table:

summary <- xtabs(~name+types+types2, data)
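
To see what the 3D table holds, it may help to look at a tiny example first. The data frame below is made up for illustration (it is not the data from the question); each variable in the formula becomes one dimension of the table, and single cells can be looked up by level name.

```r
# Hypothetical four-row data set with the same three columns
data <- data.frame(
  name   = c("data_1", "data_1", "data_1", "data_2"),
  types  = c("A", "A", "B", "A"),
  types2 = c("Z", "Z", "Y", "Y")
)

tab <- xtabs(~ name + types + types2, data)

dim(tab)                  # one dimension per variable in the formula
tab["data_1", "A", "Z"]   # count for one name/types/types2 combination
```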

Then flatten the table:

ftable(summary, row.vars=1, col.vars=2:3)
#        types    A               B               C               D               E            
#        types2   W   X   Y   Z   W   X   Y   Z   W   X   Y   Z   W   X   Y   Z   W   X   Y   Z
# name                                                                                         
# data_1         26  29 172 104  27  20 111  48  12  10  64  32  12  10  43  33  15   9  56  38
# data_2         13  14  80  54   9  12  56  35   5   4  25  18   3   2  16  14   8   4  27  13
# data_3          6  11  62  48   7  12  38  24   6   2  20   8   6   5  19  14   7   3  27   7
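
The `ftable` output is for display. If the goal is an actual data frame with the breakdown attached to the row counts, one possible approach is to cross `name` with the `types`/`types2` combination and convert the result. This is a sketch under assumptions: `make_toy`, `combo` and `wide` are names I introduced, and the toy data only has two levels per factor.

```r
set.seed(1)

# Toy stand-in for the question's data sets (hypothetical sizes and levels)
make_toy <- function(n) {
  data.frame(
    types  = factor(sample(c("A", "B"), n, replace = TRUE), levels = c("A", "B")),
    types2 = factor(sample(c("Y", "Z"), n, replace = TRUE), levels = c("Y", "Z"))
  )
}
data <- rbind(
  data.frame(name = "data_1", make_toy(30)),
  data.frame(name = "data_2", make_toy(20))
)

# One factor level per types/types2 pair, e.g. "A_Y"; lex.order keeps types outermost
combo <- interaction(data$types, data$types2, sep = "_", lex.order = TRUE)

# Rows = data sets, columns = combinations, cells = counts
wide <- as.data.frame.matrix(table(data$name, combo))

# Row counts plus the breakdown in a single data frame
summary_table <- data.frame(names = rownames(wide), counts = rowSums(wide),
                            wide, row.names = NULL)
summary_table
```

Because the factor levels are fixed up front, a combination that never occurs still gets a column (filled with 0), so every data set ends up with the same columns.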

Upvotes: 1
