Reputation: 1
I work with R, and I have a table xy like this
View( xy)
X Y
21 A
33 B
24 B
16 A
25 B
31 A
17 B
14 A
Now, I want to make groups of x and y and frequencies in steps of 10 like this at the end
Class A B
I (1-10) 0 0
II (11-20) 2 1
III (21-30) 1 2
And so on
Upvotes: 0
Views: 95
Reputation: 39858
One tidyverse
possibility could be:
df %>%
mutate(Class = X %/% 10) %>%
count(Y, Class) %>%
group_by(Y) %>%
complete(Class = seq(0, max(Class), 1)) %>%
spread(Y, n, fill = 0)
Class A B
<dbl> <dbl> <dbl>
1 0 0 0
2 1 2 1
3 2 1 2
4 3 1 1
Or if you want also the ranges:
df %>%
mutate(Class = X %/% 10) %>%
count(Y, Class) %>%
group_by(Y) %>%
complete(Class = seq(0, max(Class), 1)) %>%
spread(Y, n, fill = 0) %>%
mutate(Class = paste(Class * 10 + 1,
lead(Class * 10, default = ((last(Class) + 1) * 10)),
sep = "-"))
Class A B
<chr> <dbl> <dbl>
1 1-10 0 0
2 11-20 2 1
3 21-30 1 2
4 31-40 1 1
Or if you want the exact output you provided:
df %>%
mutate(Class = X %/% 10) %>%
count(Y, Class) %>%
group_by(Y) %>%
complete(Class = seq(0, max(Class), 1)) %>%
spread(Y, n, fill = 0) %>%
mutate(Class = paste0("(",
Class * 10 + 1,
"-",
lead(Class * 10, default = ((last(Class) + 1) * 10)),
")"),
Class = paste(as.roman(row_number()), Class, sep = " "))
Class A B
<chr> <dbl> <dbl>
1 I (1-10) 0 0
2 II (11-20) 2 1
3 III (21-30) 1 2
4 IV (31-40) 1 1
Or a possibility for the cases when X == 0:
df %>%
filter(X > 0) %>%
mutate(Class = X %/% 10) %>%
count(Y, Class) %>%
group_by(Y) %>%
complete(Class = seq(0, max(Class), 1)) %>%
spread(Y, n, fill = 0) %>%
mutate(Class = paste0("(",
Class * 10 + 1,
"-",
lead(Class * 10, default = ((last(Class) + 1) * 10)),
")"),
Class = paste(as.roman(row_number()), Class, sep = " "))
Upvotes: 1
Reputation: 269371
First create the labels using either the commented out hard coded labels or the computed labels lab
. Then use cut
and table
to create the resulting table.
# lab <- c("I (1-10)", "II (11-20)", "III (21-30)", "IV (31-40)")
n <- ceiling(max(DF$X) / 10) # 4
bounds <- seq(0, 10*n, 10) # c(0, 10, 20, 30, 40)
lab <- sprintf("%s (%d-%d)", as.roman(1:n), head(bounds, -1) + 1, bounds[-1])
Class <- cut(DF$X, bounds, lab = lab)
table(Class, Y = DF$Y)
giving:
Y
Class A B
I (1-10) 0 0
II (11-20) 2 1
III (21-30) 1 2
IV (31-40) 1 1
We assume the input data frame DF
is the following shown in reproducible form:
Lines <- "
X Y
21 A
33 B
24 B
16 A
25 B
31 A
17 B
14 A"
DF <- read.table(text = Lines, header = TRUE)
Upvotes: 3