Create frequency table per group

Question

In R, I have a data.frame with an ID, a period, and a couple of variables (year1 to year4) that each hold a year.

df <- structure(list(id = 10001:10010, 
                     period = structure(c(1L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 1L), 
                                        .Label = c("2017-01", "2017-02"), class = "factor"), 
                     year1 = c(2013, 2014, 2014, 2017, 1970, 2014, 1970, 2014, 2014, 1970), 
                     year2 = c(2015, NA, 2015, 2015, 1970, 1970, 2014, 2017, 2014, 2017), 
                     year3 = c(2015, 2014, 2015, NA, NA, 2014, 2014, 2015, 1970, 2014), 
                     year4 = c(1970, 2014, 2013, 2014, 1970, 1970, 2014, 2015, 2015, NA)), 
                .Names = c("id", "period", "year1", "year2", "year3", "year4"), 
                row.names = c(NA, -10L), 
                class = "data.frame")

What I would like to achieve is a data.frame that shows, for every period and every variable (year1-year4), how often each specific year (1970, 2013-2017) occurs. The desired result should look something like this:

result <- structure(list(period = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), 
                                            .Label = c("2017-01", "2017-02"), 
                                            class = "factor"), 
                         category = structure(c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L), 
                                              .Label = c("year1", "year2", "year3", "year4"), 
                                              class = "factor"), 
                         X1970 = c(1L, 1L, 0L, 2L, 2L, 1L, 1L, 1L), 
                         X2013 = c(1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L), 
                         X2014 = c(3L, 0L, 2L, 1L, 2L, 2L, 2L, 2L), 
                         X2015 = c(0L, 3L, 3L, 1L, 0L, 0L, 0L, 1L), 
                         X2017 = c(1L, 2L, 0L, 0L, 0L, 0L, 0L, 0L)), 
                    .Names = c("period", "category", "X1970", "X2013", "X2014", "X2015", "X2017"), 
                    class = "data.frame", 
                    row.names = c(NA, -8L))

I have intermediate experience in R and I'm confident this can be achieved, but I'm completely lost on how to approach the problem. Any suggestions or ideas?

Manuel Bickel · Accepted Answer

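If I understand your problem correctly, you can achieve this with the package reshape2: first turn the data from "wide" to "long" format via melt, then turn it back to wide via dcast, with one column per year. Since each period/variable/year combination then holds several rows, dcast aggregates them by counting (its default is length), which gives the frequencies. I hope this is what you need.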
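To illustrate the intermediate "long" format, here is a minimal sketch on a made-up two-row data frame (toy and long are illustrative names, not part of your data). After melt, every year column becomes a set of rows, with the column name stored in variable and the year stored in value:

```r
library(reshape2)

# Hypothetical toy data, only to show the shape that melt() produces
toy <- data.frame(period = factor(c("2017-01", "2017-02")),
                  year1  = c(2013, 2014),
                  year2  = c(2015, NA))

# period is kept as the id column; year1 and year2 are stacked into rows
long <- melt(toy, id.vars = "period")
print(long)
```

The result has one row per period and original column, which is the shape dcast needs in order to count the values.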

df <- structure(list(id = 10001:10010, 
                     period = structure(c(1L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 1L), 
                                        .Label = c("2017-01", "2017-02"), class = "factor"), 
                     year1 = c(2013, 2014, 2014, 2017, 1970, 2014, 1970, 2014, 2014, 1970), 
                     year2 = c(2015, NA, 2015, 2015, 1970, 1970, 2014, 2017, 2014, 2017), 
                     year3 = c(2015, 2014, 2015, NA, NA, 2014, 2014, 2015, 1970, 2014), 
                     year4 = c(1970, 2014, 2013, 2014, 1970, 1970, 2014, 2015, 2015, NA)), 
                .Names = c("id", "period", "year1", "year2", "year3", "year4"), 
                row.names = c(NA, -10L), 
                class = "data.frame")



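library(reshape2)

# Drop the id column, melt to long format with period as the id variable,
# then count how often each year occurs per period and variable.
dcast(melt(df[, -1], id.vars = "period"), period + variable ~ value,
      fun.aggregate = length, value.var = "value")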

#     period variable 1970 2013 2014 2015 2017 NA
# 1 2017-01    year1    1    1    3    0    1  0
# 2 2017-01    year2    1    0    0    3    2  0
# 3 2017-01    year3    0    0    2    3    0  1
# 4 2017-01    year4    2    1    1    1    0  1
# 5 2017-02    year1    2    0    2    0    0  0
# 6 2017-02    year2    1    0    2    0    0  1
# 7 2017-02    year3    1    0    2    0    0  1
# 8 2017-02    year4    1    0    2    1    0  0
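
The output above still differs from your desired result in three small ways: it has an extra NA column, the second column is called variable rather than category, and the year columns lack the X prefix. If you want to match it more closely, something along these lines might work. This is only a sketch: it assumes you want to ignore missing years entirely, and the names long, counts and year_cols are mine.

```r
library(reshape2)

# Same data as in the question, written with data.frame() for brevity
df <- data.frame(
  id     = 10001:10010,
  period = factor(c("2017-01", "2017-02", "2017-01", "2017-01", "2017-02",
                    "2017-01", "2017-02", "2017-01", "2017-02", "2017-01")),
  year1  = c(2013, 2014, 2014, 2017, 1970, 2014, 1970, 2014, 2014, 1970),
  year2  = c(2015, NA, 2015, 2015, 1970, 1970, 2014, 2017, 2014, 2017),
  year3  = c(2015, 2014, 2015, NA, NA, 2014, 2014, 2015, 1970, 2014),
  year4  = c(1970, 2014, 2013, 2014, 1970, 1970, 2014, 2015, 2015, NA)
)

# na.rm = TRUE drops rows with a missing year, so no NA column is created
long <- melt(df[, -1], id.vars = "period", na.rm = TRUE)

# Count the rows for each period/variable/year combination
counts <- dcast(long, period + variable ~ value,
                fun.aggregate = length, value.var = "value")

# Rename columns to follow the layout of the desired result
names(counts)[names(counts) == "variable"] <- "category"
year_cols <- !(names(counts) %in% c("period", "category"))
names(counts)[year_cols] <- paste0("X", names(counts)[year_cols])

print(counts)
```

Note that a period/variable combination consisting only of NA values would disappear from the result with na.rm = TRUE; that does not happen with the data shown here.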
