Reputation: 4692
Is there a function that will give me both counts and column/overall percents in the same table? I have looked at both tables and reshape2 and don't see an option for doing it. Here is a small example:
n <- 100
x <- sample(letters[1:3], n, TRUE)
y <- sample(letters[1:3], n, TRUE)
d <- data.frame(x=x, y=y, stringsAsFactors=TRUE)
With tables, this is very clunky, as it requires me to unlist and recombine.
> library(tables)
> (t1 <- tabular(x~y*(n=length), d))
   a  b  c
 x n  n  n
 a 13 14 11
 b  8 11 13
 c 10 12  8
> prop.table(matrix(unlist(t1),3,3), 1)
          [,1]      [,2]      [,3]
[1,] 0.3421053 0.3684211 0.2894737
[2,] 0.2500000 0.3437500 0.4062500
[3,] 0.3333333 0.4000000 0.2666667
With reshape2 it is a little easier, but still not a single step.
> library(reshape2)
> (t2 <- acast(d, x~y, length))
Using y as value column: use value_var to override.
   a  b  c
a 13 14 11
b  8 11 13
c 10 12  8
> (t3 <- prop.table(t2,1))
          a         b         c
a 0.3421053 0.3684211 0.2894737
b 0.2500000 0.3437500 0.4062500
c 0.3333333 0.4000000 0.2666667
What I really want is output that looks something like this:
> structure(list(
+ a = data.frame(n=t2[,1], pct=t3[,1]),
+ b = data.frame(n=t2[,2], pct=t3[,2]),
+ c = data.frame(n=t2[,3], pct=t3[,3])),
+ class = 'data.frame',
+ row.names = letters[1:3])
  a.n     a.pct b.n     b.pct c.n     c.pct
a  13 0.3421053  14 0.3684211  11 0.2894737
b   8 0.2500000  11 0.3437500  13 0.4062500
c  10 0.3333333  12 0.4000000   8 0.2666667
Is there a way to do this easily with R?
Upvotes: 8
Views: 7434
Reputation: 263362
tbl <- with(d, table(x,y) )   # cross-tabulated counts
pct.tbl <- prop.table(tbl)    # overall proportions (each cell / grand total)
colnames(pct.tbl) <- paste("pct",colnames(pct.tbl), sep=".")
# The next line constructs an interleaving index to rearrange the columns
cbind(tbl, pct.tbl)[, c( matrix(1:(2*ncol(tbl)), nrow=2, byrow=TRUE) )]
#------
   a pct.a  b pct.b  c pct.c
a 11  0.11 10  0.10 12  0.12
b  6  0.06 11  0.11 11  0.11
c 12  0.12 11  0.11 16  0.16
Another way to do the interleaving is to use c() to straighten out a transposed matrix sequence:
c( t( matrix(1:(2*ncol(tbl)), ncol=2) ) )
#[1] 1 4 2 5 3 6
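A third way to build the same index, sketched here with base R only: rbind() stacks the two runs of column positions and c() then reads them off column by column. The names k and idx are just for illustration.

```r
k <- 3                                        # number of columns in the count table
idx <- c(rbind(seq_len(k), k + seq_len(k)))   # pair column i with column k + i
idx
#[1] 1 4 2 5 3 6
```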
If you want the proportions to be column proportions instead, pass 2 as the margin after the 'tbl' argument in the prop.table call:
prop.table(tbl,2)
#----------
   y
x           a         b         c
  a 0.3793103 0.3125000 0.3076923
  b 0.2068966 0.3437500 0.2820513
  c 0.4137931 0.3437500 0.4102564
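The steps in this answer could be wrapped into a small helper. This is only a sketch: count_pct is a hypothetical name (not from any package), the .n/.pct suffixes follow the layout asked for in the question, and the seed is arbitrary.

```r
# Hypothetical helper: counts and proportions side by side.
# margin = NULL gives overall proportions, 1 row proportions, 2 column proportions.
count_pct <- function(x, y, margin = NULL) {
  tbl <- table(x, y)
  pct <- prop.table(tbl, margin)
  colnames(pct) <- paste(colnames(pct), "pct", sep = ".")
  colnames(tbl) <- paste(colnames(tbl), "n", sep = ".")
  k <- ncol(tbl)
  # interleave: count column 1, pct column 1, count column 2, ...
  idx <- c(rbind(seq_len(k), k + seq_len(k)))
  as.data.frame(cbind(tbl, pct)[, idx, drop = FALSE])
}

set.seed(1)  # arbitrary seed so the example is repeatable
n <- 100
d <- data.frame(x = sample(letters[1:3], n, TRUE),
                y = sample(letters[1:3], n, TRUE))
res <- count_pct(d$x, d$y, margin = 2)  # column proportions
res
```

With margin = 2 each .pct column sums to 1; with margin = 1 the .pct values in each row sum to 1.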
Upvotes: 3
Reputation: 49640
Here is one approach. You still need a second step, but it comes before the tabular command, so the result is still a tabular object.
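library(tables)
n <- 100
x <- sample(letters[1:3], n, TRUE)
y <- sample(letters[1:3], n, TRUE)
# tabular() treats factors as categories (strings are not factors by default in R >= 4.0)
d <- data.frame(x=x, y=y, stringsAsFactors=TRUE)
# weight each row by 1/(size of its x group), so the weights in a cell sum to the row proportion
d$z <- 1/ave( rep(1,n), d$x, FUN=sum )
# n counts the rows in each cell, p sums the weights; Heading() suppresses the label for z
(t1 <- tabular(x~y*Heading()*z*((n=length) + (p=sum)), d))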
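To see why summing z gives a percentage, here is a base-R check that does not need the tables package. It is a sketch with an arbitrary seed; p_sum and p_ref are illustrative names.

```r
set.seed(1)  # arbitrary seed so the example is repeatable
n <- 100
d <- data.frame(x = sample(letters[1:3], n, TRUE),
                y = sample(letters[1:3], n, TRUE))

# each observation gets weight 1 / (number of observations sharing its x value)
d$z <- 1 / ave(rep(1, n), d$x, FUN = sum)

# summing the weights within each (x, y) cell ...
p_sum <- tapply(d$z, list(d$x, d$y), sum)
p_sum[is.na(p_sum)] <- 0   # an empty cell would come back as NA

# ... should match the row proportions from prop.table
p_ref <- prop.table(table(d$x, d$y), 1)
same <- isTRUE(all.equal(as.vector(p_sum), as.vector(p_ref)))
same
```

Grouping by d$y instead of d$x in the ave call would give column proportions, and using 1/n for every row would give overall proportions.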
Upvotes: 3
Reputation: 23898
Use the CrossTable function from the gmodels package.
library(gmodels)
Check the arguments of CrossTable:
args(CrossTable)
function (x, y, digits = 3, max.width = 5, expected = FALSE,
prop.r = TRUE, prop.c = TRUE, prop.t = TRUE, prop.chisq = TRUE,
chisq = FALSE, fisher = FALSE, mcnemar = FALSE, resid = FALSE,
sresid = FALSE, asresid = FALSE, missing.include = FALSE,
format = c("SAS", "SPSS"), dnn = NULL, ...)
NULL
Apply CrossTable:
CrossTable(x=d$x, y=d$y)
   Cell Contents
|-------------------------|
|                       N |
| Chi-square contribution |
|           N / Row Total |
|           N / Col Total |
|         N / Table Total |
|-------------------------|

Total Observations in Table:  100

             | d$y
         d$x |         a |         b |         c | Row Total |
-------------|-----------|-----------|-----------|-----------|
           a |        13 |        12 |         8 |        33 |
             |     0.182 |     0.306 |     0.924 |           |
             |     0.394 |     0.364 |     0.242 |     0.330 |
             |     0.371 |     0.387 |     0.235 |           |
             |     0.130 |     0.120 |     0.080 |           |
-------------|-----------|-----------|-----------|-----------|
           b |        13 |        11 |        18 |        42 |
             |     0.197 |     0.313 |     0.969 |           |
             |     0.310 |     0.262 |     0.429 |     0.420 |
             |     0.371 |     0.355 |     0.529 |           |
             |     0.130 |     0.110 |     0.180 |           |
-------------|-----------|-----------|-----------|-----------|
           c |         9 |         8 |         8 |        25 |
             |     0.007 |     0.008 |     0.029 |           |
             |     0.360 |     0.320 |     0.320 |     0.250 |
             |     0.257 |     0.258 |     0.235 |           |
             |     0.090 |     0.080 |     0.080 |           |
-------------|-----------|-----------|-----------|-----------|
Column Total |        35 |        31 |        34 |       100 |
             |     0.350 |     0.310 |     0.340 |           |
-------------|-----------|-----------|-----------|-----------|
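If only counts and column percents are wanted, the prop.* switches listed in the arguments above can be turned off. The following is a sketch, assuming gmodels is installed; the component names t and prop.col on the returned list are as I recall them from the gmodels documentation and worth checking with str(ct). The seed is arbitrary.

```r
set.seed(1)  # arbitrary seed so the example is repeatable
n <- 100
d <- data.frame(x = sample(letters[1:3], n, TRUE),
                y = sample(letters[1:3], n, TRUE))

if (requireNamespace("gmodels", quietly = TRUE)) {
  # keep only N and N / Col Total in each cell
  ct <- gmodels::CrossTable(d$x, d$y,
                            prop.r = FALSE, prop.t = FALSE, prop.chisq = FALSE)
  # the result is returned invisibly as a list (assumed components: t, prop.col)
  print(ct$t)         # counts
  print(ct$prop.col)  # column proportions
}
```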
Upvotes: 3