Reputation: 21
I'm trying to find a function similar to SAS's Proc Means which will aggregate data and subtotal/total it by groups.
For example I have:
Var1 Var2
a b
a c
b b
and I want to create:
Var1 Var2 N
a b 1
a c 1
b b 1
na b 2
na c 1
a na 2
b na 1
na na 3
I've found that Summarise() and Aggregate() are able to do this but without the subtotals. There is also a Cube() function in development for data.table which does this, but I can only download from CRAN due to our IT security policy.
As you can probably tell, I'm new to R so I'm sorry if this is a fairly simple question.
Thanks!
Upvotes: 1
Views: 2137
Reputation: 269481
Using DF
in the Note at the end try this one-liner. The same code works if there are a different number of columns. Also try it without the as.data.frame
for wide format. No packages are used.
as.data.frame(addmargins(xtabs(~., DF)))
giving:
Var1 Var2 Freq
1 a b 1
2 b b 1
3 Sum b 2
4 a c 1
5 b c 0
6 Sum c 1
7 a Sum 2
8 b Sum 1
9 Sum Sum 3
DF
in reproducible form is:
DF <- structure(list(Var1 = structure(c(1L, 1L, 2L), .Label = c("a",
"b"), class = "factor"), Var2 = structure(c(1L, 2L, 1L), .Label = c("b",
"c"), class = "factor")), class = "data.frame", row.names = c(NA,
-3L))
Upvotes: 5
Reputation: 10761
Here's a way you could do this, using bind_rows
and count
from dplyr
.
library(dplyr)
dat %>% count(Var1, Var2) %>% # count by Var1 and Var2
bind_rows(dat %>% count(Var1)) %>% # count by Var1
bind_rows(dat %>% count(Var2)) %>% # count by Var2
bind_rows(dat %>% count) # count rows
Var1 Var2 n
<chr> <chr> <int>
1 a b 1
2 a c 1
3 b b 1
4 a NA 2
5 b NA 1
6 NA b 2
7 NA c 1
8 NA NA 3
dat <- read.table(text = "Var1 Var2
a b
a c
b b", stringsAsFactors = FALSE, header = TRUE)
Upvotes: 1