Jenny
Jenny

Reputation: 1

Aggregate frequency classification table

I work with R, and I have a table xy like this

View( xy)
X             Y
21           A
33           B
24           B
16           A
25           B
31           A
17           B
14           A

Now, I want to make groups of x and y and frequencies in steps of 10 like this at the end


Class                  A          B
I (1-10)               0          0
II (11-20)             2          1
III (21-30)            1          2

And so on

Upvotes: 0

Views: 95

Answers (2)

tmfmnk
tmfmnk

Reputation: 39858

One tidyverse possibility could be:

df %>%
 mutate(Class = X %/% 10) %>%
 count(Y, Class) %>%
 group_by(Y) %>%
 complete(Class = seq(0, max(Class), 1)) %>%
 spread(Y, n, fill = 0) 

  Class     A     B
  <dbl> <dbl> <dbl>
1     0     0     0
2     1     2     1
3     2     1     2
4     3     1     1

Or if you want also the ranges:

df %>%
 mutate(Class = X %/% 10) %>%
 count(Y, Class) %>%
 group_by(Y) %>%
 complete(Class = seq(0, max(Class), 1)) %>%
 spread(Y, n, fill = 0) %>%
 mutate(Class = paste(Class * 10 + 1, 
                      lead(Class * 10, default = ((last(Class) + 1) * 10)),
                      sep = "-"))

   Class     A     B
  <chr> <dbl> <dbl>
1 1-10      0     0
2 11-20     2     1
3 21-30     1     2
4 31-40     1     1

Or if you want the exact output you provided:

df %>%
 mutate(Class = X %/% 10) %>%
 count(Y, Class) %>%
 group_by(Y) %>%
 complete(Class = seq(0, max(Class), 1)) %>%
 spread(Y, n, fill = 0) %>%
 mutate(Class = paste0("(",
                       Class * 10 + 1, 
                      "-",
                      lead(Class * 10, default = ((last(Class) + 1) * 10)),
                      ")"),
        Class = paste(as.roman(row_number()), Class, sep = " "))

  Class           A     B
  <chr>       <dbl> <dbl>
1 I (1-10)        0     0
2 II (11-20)      2     1
3 III (21-30)     1     2
4 IV (31-40)      1     1

Or a possibility for the cases when X == 0:

df %>%
 filter(X > 0) %>%
 mutate(Class = X %/% 10) %>%
 count(Y, Class) %>%
 group_by(Y) %>%
 complete(Class = seq(0, max(Class), 1)) %>%
 spread(Y, n, fill = 0) %>%
 mutate(Class = paste0("(",
                       Class * 10 + 1, 
                      "-",
                      lead(Class * 10, default = ((last(Class) + 1) * 10)),
                      ")"),
        Class = paste(as.roman(row_number()), Class, sep = " "))

Upvotes: 1

G. Grothendieck
G. Grothendieck

Reputation: 269371

First create the labels using either the commented out hard coded labels or the computed labels lab. Then use cut and table to create the resulting table.

# lab <- c("I (1-10)", "II (11-20)", "III (21-30)", "IV (31-40)")
n <- ceiling(max(DF$X) / 10)  # 4
bounds <- seq(0, 10*n, 10)    # c(0, 10, 20, 30, 40)
lab <- sprintf("%s (%d-%d)", as.roman(1:n), head(bounds, -1) + 1, bounds[-1])

Class <- cut(DF$X, bounds, lab = lab)
table(Class, Y = DF$Y)

giving:

             Y
Class         A B
  I (1-10)    0 0
  II (11-20)  2 1
  III (21-30) 1 2
  IV (31-40)  1 1

Note

We assume the input data frame DF is the following shown in reproducible form:

Lines <- "
X            Y
21           A
33           B
24           B
16           A
25           B
31           A
17           B
14           A"
DF <- read.table(text = Lines, header = TRUE)

Upvotes: 3

Related Questions