Andrew Redd

Reputation: 4692

Getting both column counts and proportions in the same table in R

Is there a function that will give me both counts and column/overall percents in the same table? I have looked at both tables and reshape2 and don't see an option for doing it. Here is a small example:

Data setup

n <- 100
x <- sample(letters[1:3], n, replace = TRUE)
y <- sample(letters[1:3], n, replace = TRUE)
d <- data.frame(x=x, y=y)

With tables

This is very clunky as it requires me to unlist and recombine.

> library(tables)
> (t1 <- tabular(x~y*(n=length), d))

   a  b  c 
 x n  n  n 
 a 13 14 11
 b  8 11 13
 c 10 12  8
> prop.table(matrix(unlist(t1),3,3), 1)
          [,1]      [,2]      [,3]
[1,] 0.3421053 0.3684211 0.2894737
[2,] 0.2500000 0.3437500 0.4062500
[3,] 0.3333333 0.4000000 0.2666667

With reshape2

This is a little easier, but it still takes more than one step.

> library(reshape2)
> (t2 <- acast(d, x~y, length))
Using y as value column: use value_var to override.
   a  b  c
a 13 14 11
b  8 11 13
c 10 12  8
> (t3 <- prop.table(t2,1))
          a         b         c
a 0.3421053 0.3684211 0.2894737
b 0.2500000 0.3437500 0.4062500
c 0.3333333 0.4000000 0.2666667

Desired output

What I really want is output that looks something like this:

> structure(list(
+     a = data.frame(n=t2[,1], pct=t3[,1]),
+     b = data.frame(n=t2[,2], pct=t3[,2]),
+     c = data.frame(n=t2[,3], pct=t3[,3])), 
+   class = 'data.frame',
+   row.names = letters[1:3])
  a.n     a.pct b.n     b.pct c.n     c.pct
a  13 0.3421053  14 0.3684211  11 0.2894737
b   8 0.2500000  11 0.3437500  13 0.4062500
c  10 0.3333333  12 0.4000000   8 0.2666667

Is there a way to do this easily with R?

Upvotes: 8

Views: 7434

Answers (3)

IRTFM

Reputation: 263362

tbl <- with(d, table(x, y))
pct.tbl <- prop.table(tbl)
colnames(pct.tbl) <- paste("pct", colnames(pct.tbl), sep=".")
# The next line constructs an interleaving index to rearrange the columns
cbind(tbl, pct.tbl)[, c( matrix(1:(2*ncol(tbl)), nrow=2, byrow=TRUE) )]
#------
   a pct.a  b pct.b  c pct.c
a 11  0.11 10  0.10 12  0.12
b  6  0.06 11  0.11 11  0.11
c 12  0.12 11  0.11 16  0.16

Another way to do the interleaving is to use c to straighten out a transposed matrix sequence:

c( t( matrix(1:(2*ncol(tbl)), ncol=2) ) )
#[1] 1 4 2 5 3 6

And if you want those proportions to be column percents, just pass 2 as the margin argument after 'tbl' in the prop.table call:

prop.table(tbl,2)
#----------
   y
x           a         b         c
  a 0.3793103 0.3125000 0.3076923
  b 0.2068966 0.3437500 0.2820513
  c 0.4137931 0.3437500 0.4102564
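Putting the pieces above together, here is a minimal base-R sketch that produces the a.n / a.pct layout asked for in the question. It assumes row proportions (margin 1, as in the question's own prop.table calls); the seed and the names n.mat, pct.mat, idx and res are only illustrative and are not part of the original answer.

```r
set.seed(42)  # assumption: fixed seed only so the sketch is reproducible
n <- 100
d <- data.frame(x = sample(letters[1:3], n, replace = TRUE),
                y = sample(letters[1:3], n, replace = TRUE))

tbl <- with(d, table(x, y))
pct <- prop.table(tbl, 1)   # row proportions; use 2 for column proportions

# Drop the "table" class and label the columns a.n, a.pct, ...
n.mat   <- unclass(tbl)
pct.mat <- unclass(pct)
colnames(n.mat)   <- paste(colnames(tbl), "n",   sep = ".")
colnames(pct.mat) <- paste(colnames(tbl), "pct", sep = ".")

# Interleaving index: 1, k+1, 2, k+2, ... where k is the number of columns
k   <- ncol(tbl)
idx <- c(rbind(seq_len(k), k + seq_len(k)))

res <- as.data.frame(cbind(n.mat, pct.mat)[, idx])
res
```

The result is an ordinary data frame, so the counts are stored as doubles alongside the proportions; format or round the pct columns at print time if needed.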

Upvotes: 3

Greg Snow

Reputation: 49640

Here is one approach. You still need a second step, but it comes before the tabular command, so the result is still a tabular object.

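library(tables)

n <- 100
x <- sample(letters[1:3], n, replace = TRUE)
y <- sample(letters[1:3], n, replace = TRUE)
d <- data.frame(x=x, y=y)
# Weight each row by 1 / (number of rows sharing its x), so that
# summing z within a cell gives that cell's row proportion.
d$z <- 1/ave( rep(1,n), d$x, FUN=sum )

(t1 <- tabular(x~y*Heading()*z*((n=length) + (p=sum)), d))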
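To see why summing z works, the following base-R sketch compares the per-cell sums of z with prop.table over rows. It does not use tabular at all; the seed and the names p.by.sum and p.direct are assumptions made for illustration, and the comparison presumes that no x/y cell is empty (tapply would return NA there, whereas table returns 0).

```r
set.seed(1)  # assumption: fixed seed only so the check is reproducible
n <- 100
d <- data.frame(x = sample(letters[1:3], n, replace = TRUE),
                y = sample(letters[1:3], n, replace = TRUE))

# Each row gets weight 1 / (size of its x group)
d$z <- 1 / ave(rep(1, n), d$x, FUN = sum)

# Summing the weights within each (x, y) cell ...
p.by.sum <- tapply(d$z, list(d$x, d$y), sum)

# ... should match the row proportions computed directly
p.direct <- prop.table(table(d$x, d$y), 1)

all.equal(as.vector(p.by.sum), as.vector(p.direct))
```

For column proportions instead, the same idea should carry over by grouping the weights on d$y rather than d$x.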

Upvotes: 3

MYaseen208

Reputation: 23898

Use the CrossTable function from the gmodels package.

library(gmodels)

Check the arguments of CrossTable

args(CrossTable)
function (x, y, digits = 3, max.width = 5, expected = FALSE, 
    prop.r = TRUE, prop.c = TRUE, prop.t = TRUE, prop.chisq = TRUE, 
    chisq = FALSE, fisher = FALSE, mcnemar = FALSE, resid = FALSE, 
    sresid = FALSE, asresid = FALSE, missing.include = FALSE, 
    format = c("SAS", "SPSS"), dnn = NULL, ...) 
NULL

Apply CrossTable

CrossTable(x=d$x, y=d$y)



   Cell Contents
|-------------------------|
|                       N |
| Chi-square contribution |
|           N / Row Total |
|           N / Col Total |
|         N / Table Total |
|-------------------------|


Total Observations in Table:  100 


             | d$y 
         d$x |         a |         b |         c | Row Total | 
-------------|-----------|-----------|-----------|-----------|
           a |        13 |        12 |         8 |        33 | 
             |     0.182 |     0.306 |     0.924 |           | 
             |     0.394 |     0.364 |     0.242 |     0.330 | 
             |     0.371 |     0.387 |     0.235 |           | 
             |     0.130 |     0.120 |     0.080 |           | 
-------------|-----------|-----------|-----------|-----------|
           b |        13 |        11 |        18 |        42 | 
             |     0.197 |     0.313 |     0.969 |           | 
             |     0.310 |     0.262 |     0.429 |     0.420 | 
             |     0.371 |     0.355 |     0.529 |           | 
             |     0.130 |     0.110 |     0.180 |           | 
-------------|-----------|-----------|-----------|-----------|
           c |         9 |         8 |         8 |        25 | 
             |     0.007 |     0.008 |     0.029 |           | 
             |     0.360 |     0.320 |     0.320 |     0.250 | 
             |     0.257 |     0.258 |     0.235 |           | 
             |     0.090 |     0.080 |     0.080 |           | 
-------------|-----------|-----------|-----------|-----------|
Column Total |        35 |        31 |        34 |       100 | 
             |     0.350 |     0.310 |     0.340 |           | 
-------------|-----------|-----------|-----------|-----------|

Upvotes: 3
