Reputation: 833
I am currently learning R and I encountered problems with tabulating data.
I have integer scores in a data frame, model, that range from 1 to 10 (inclusive). When I use the table function, i.e.
table(model$score)
I get the following result:
1 2 3 4 5 6 7 8 9 10
5 6 8 7 2 3 6 4 5 0
However, I want to tabulate the data in the following format:
1-2 3-4 5-6 7-8 9-10
11 15 5 10 5
Is it possible to achieve this with the table function, or do I need another function or package? If so, how do I do it? Is there a similar way to do this with the prop.table function?
Thank you for your help.
Upvotes: 1
Views: 144
Reputation: 51592
You could also use the zoo package:
library(zoo)
rollapply(table(model$score), 2, by = 2, sum)
Using @Zheyuan Li's example (updated as per @G.Grothendieck's comment):
tt <- rollapply(table(a), 2, by = 2, sum)
names(tt) <- rollapply(names(table(a)), 2, by = 2, paste, collapse = "-")
tt
# 1-2 3-4 5-6 7-8 9-10
# 29 45 43 47 36
Upvotes: 4
Reputation: 887291
Here is a faster option with RcppRoll and tabulate:
library(RcppRoll)
nm1 <- do.call(paste, c(as.data.frame(matrix(1:10, ncol=2, byrow=TRUE)), list(sep="-")))
setNames(roll_sum(tabulate(a, nbins = 10), 2)[c(TRUE, FALSE)], nm1)
# 1-2 3-4 5-6 7-8 9-10
# 29 45 43 47 36
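For comparison, the same pairwise sum can be sketched in base R without any extra package. This is only an illustration: the data vector below is made up, and the idea is to reshape the ten counts into a 2-row matrix so that each column holds one pair of adjacent scores.

```r
# Hypothetical example data standing in for the scores (integers 1..10).
a <- c(1L, 2L, 2L, 3L, 5L, 5L, 6L, 9L, 9L, 9L)

# Count each score; nbins = 10 keeps all ten bins even if 10 never occurs.
counts <- tabulate(a, nbins = 10)

# Fill a 2-row matrix column by column, so column j holds scores 2j-1 and 2j,
# then sum each column to get the count per pair.
paired <- colSums(matrix(counts, nrow = 2))

# Build the labels "1-2", "3-4", ..., "9-10".
names(paired) <- paste(seq(1, 9, by = 2), seq(2, 10, by = 2), sep = "-")
paired
```

Passing nbins explicitly matters when the largest score is absent from the data, because tabulate() otherwise stops at the maximum value it sees.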
Upvotes: 4
Reputation: 73345
Why not simply do this?
x <- table(model$score)
x <- x[c(1,3,5,7,9)] + x[c(2,4,6,8,10)]
names(x) <- c("1-2","3-4","5-6","7-8","9-10")
It does not introduce any extra complexity.
table will of course give you a vector of length 10, because you have 10 unique levels.
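One caveat, sketched here with made-up data: if some scores never occur in a plain integer vector, table() returns fewer than 10 entries and the positional indexing above no longer lines up with the scores. Assuming the scores are known to run from 1 to 10, wrapping them in factor() with explicit levels keeps all ten positions.

```r
# Hypothetical scores in which 4, 7, 8 and 10 never occur.
score <- c(1L, 2L, 2L, 3L, 5L, 5L, 6L, 9L, 9L, 9L)

# Only the six observed values get an entry here.
length(table(score))

# Declaring the levels up front always yields 10 entries, zeros included.
x <- table(factor(score, levels = 1:10))

# Now the odd/even positions really correspond to scores 1..10.
x <- x[c(1, 3, 5, 7, 9)] + x[c(2, 4, 6, 8, 10)]
names(x) <- c("1-2", "3-4", "5-6", "7-8", "9-10")
x
```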
Well, if you insist on calling table() to get the result you want, you need to use cut() to classify your data into bins:
set.seed(0); a <- sample(1:10, 200, replace = TRUE)
table(cut(a, breaks = c(0,2,4,6,8,10)))
(0,2] (2,4] (4,6] (6,8] (8,10]
29 45 43 47 36
Change the labels? Use the labels argument (inside cut()):
table(cut(a, breaks = c(0,2,4,6,8,10), labels = c("1-2","3-4","5-6","7-8","9-10")))
1-2 3-4 5-6 7-8 9-10
29 45 43 47 36
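As for the prop.table part of the question, a minimal sketch is to pass the binned table straight to prop.table(). The variable names binned and props are just for illustration.

```r
# Same simulated data as in the example above.
set.seed(0); a <- sample(1:10, 200, replace = TRUE)

# Bin the scores into the five labelled ranges.
binned <- table(cut(a, breaks = c(0, 2, 4, 6, 8, 10),
                    labels = c("1-2", "3-4", "5-6", "7-8", "9-10")))

# prop.table() divides each count by the total, so the proportions sum to 1.
props <- prop.table(binned)
round(props, 3)
```

The exact proportions depend on the simulated sample, so none are shown here.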
But you must make sure a is numeric. You will get an error if, for example, a is a factor:
a <- factor(a)
table(cut(a, breaks = c(0,2,4,6,8,10)))
Error in cut.default(a, breaks = c(0, 2, 4, 6, 8, 10)) :
'x' must be numeric
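If the scores do arrive as a factor, one possible workaround is to convert them back to numbers before calling cut(). The sketch below uses made-up data and assumes the factor labels are the scores themselves.

```r
# Hypothetical factor of scores; its levels are "1", "2", "3", "5", "6", "9".
a <- factor(c(1, 2, 2, 3, 5, 5, 6, 9, 9, 9))

# as.integer(a) alone would return the internal level codes (1, 2, 3, 4, ...),
# not the scores, so convert to character first and then to integer.
a_num <- as.integer(as.character(a))

tab <- table(cut(a_num, breaks = c(0, 2, 4, 6, 8, 10)))
tab
```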
Upvotes: 6