user3537951

Reputation: 59

All combinations of two-way tables

How can I generate all two-way tables from a data frame in R?

some_data <- data.frame(replicate(100, base::sample(1:4, size = 50, replace = TRUE)))
combos <- combn(names(some_data), 2)

The following does not work. I was planning to wrap a for loop around it and store the result of each iteration somewhere.

i=1
table(some_data[combos[, i][1]], some_data[combos[, i][2]])

Why does this not work? The individual arguments evaluate as expected:

some_data[combos[, i][1]]
some_data[combos[, i][2]]

Calling it with the variable names directly yields the desired result, but how can I loop through all combos in this structure?

table(some_data$X1, some_data$X2)

Upvotes: 1

Views: 364

Answers (1)

akrun

Reputation: 887691

combn has a FUN argument, so we can use it to subset 'some_data' by each pair of column names and build the table; the results come back as an array

out <- combn(names(some_data), 2, FUN = function(i) table(some_data[i]))
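To make the shape of that array concrete, here is a minimal sketch on a smaller, hypothetical data frame (`toy` is not from the question). It assumes every column contains all four values, which the construction below guarantees, so every table is 4 x 4 and the third dimension indexes the column pairs in the same order as `combn(names(toy), 2)`.

```r
set.seed(1)  # assumption: seed added only so the sketch is reproducible

# Each column is a shuffled copy of rep(1:4, ...), so all four values
# are guaranteed to occur in every column.
toy <- data.frame(replicate(6, sample(rep(1:4, length.out = 50))))

pairs <- combn(names(toy), 2)   # 2 x 15 matrix of column-name pairs
out <- combn(names(toy), 2, FUN = function(i) table(toy[i]))

dim(out)       # 4 4 15: one 4 x 4 slice per pair
pairs[, 1]     # the pair behind the first slice: "X1" "X2"
out[, , 1]     # counts for X1 vs X2
```

The slices hold the counts only, so `pairs` is kept alongside `out` to look up which columns a given slice belongs to.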

Regarding the issue in the OP's post

table(some_data[combos[, i][1]], some_data[combos[, i][2]])

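Both arguments are one-column data.frames, because indexing with a single index (some_data[name]) keeps the data.frame structure. table() needs vectors when the columns are passed as separate arguments, so extracting each column as a vector makes it work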

table(some_data[, combos[, i][1]], some_data[, combos[, i][2]])
                ^^                           ^^

or more compactly

table(some_data[combos[, i]])
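The difference between the two kinds of indexing, and the for loop the question had in mind, can be sketched as follows. The data frame `toy` and the list `results` are hypothetical names used only for illustration.

```r
# Small hypothetical data frame, not the one from the question
toy <- data.frame(X1 = c(1, 2, 2, 1), X2 = c(3, 3, 4, 4), X3 = c(5, 6, 5, 6))
combos <- combn(names(toy), 2)

class(toy[combos[1, 1]])     # "data.frame": single index keeps the structure
class(toy[, combos[1, 1]])   # "numeric": row/column index drops to a vector
class(toy[[combos[1, 1]]])   # "numeric": [[ ]] also extracts the vector

# Loop over all pairs and store each two-way table in a list
results <- vector("list", ncol(combos))
for (i in seq_len(ncol(combos))) {
  results[[i]] <- table(toy[combos[, i]])  # two-column data.frame -> 2-way table
}
length(results)              # 3, one table per pair
```

Storing the tables in a pre-allocated list avoids growing an object inside the loop and works even when the tables differ in size.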

Update

By default, combn uses simplify = TRUE, which combines the results into an array. That only works when every table has the same dimensions. If the columns do not all share the same set of observed values, the tables come out with different dimensions, unless each column is first converted to a factor with the levels specified. An array has fixed dimensions, so as soon as one table differs in size the call results in an error. One way around this is simplify = FALSE, which returns a list; a list doesn't have that restriction.
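The factor route mentioned above could look like the sketch below. The names `narrow`, `wide`, `mixed`, `all_levels` and `mixed_f` are hypothetical, and the choice to use the union of all observed values as the common level set is an assumption; any fixed level set that covers the data would do.

```r
set.seed(24)
# Hypothetical data: three columns drawn from 1:4, two columns drawn from 1:10
narrow <- data.frame(replicate(3, sample(1:4, size = 50, replace = TRUE)))
wide   <- data.frame(replicate(2, sample(1:10, size = 50, replace = TRUE)))
mixed  <- data.frame(narrow, wide)  # duplicate names are made unique

# Assumption: use every value seen anywhere as the shared level set
all_levels <- sort(unique(unlist(mixed)))
mixed_f <- data.frame(lapply(mixed, factor, levels = all_levels))

# Every table now has length(all_levels) rows and columns,
# so the default simplify = TRUE can stack them into one array.
out_f <- combn(names(mixed_f), 2, FUN = function(i) table(mixed_f[i]))
dim(out_f)
```

The trade-off is that tables for the narrow columns are padded with rows and columns of zeros for levels they never take.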

Here is an example where the previous code fails

set.seed(24)
some_data2 <- data.frame(replicate(5, base::sample(1:10, size = 50, 
     replace = TRUE))) 
some_data <- data.frame(some_data, some_data2)

out1 <- combn(names(some_data), 2, FUN = function(i)
            table(some_data[i]), simplify = FALSE)

is.list(out1)
#[1] TRUE
length(out1)
#[1] 5460
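With several thousand tables in a list, it can help to name each element after its pair of columns. This is a minimal sketch on a hypothetical three-column data frame; the "x_y" naming scheme is an assumption chosen for illustration, not something combn does on its own.

```r
# Hypothetical data with a different number of distinct values per column
toy <- data.frame(x = c(1, 1, 2, 2), y = c(1, 2, 3, 3), z = c(5, 5, 5, 6))

pairs <- combn(names(toy), 2)
tabs <- combn(names(toy), 2, FUN = function(i) table(toy[i]),
              simplify = FALSE)

# Label each table with the two column names it was built from
names(tabs) <- apply(pairs, 2, paste, collapse = "_")

names(tabs)        # "x_y" "x_z" "y_z"
dim(tabs[["x_y"]]) # 2 3: the tables are free to differ in size
tabs[["y_z"]]      # look up a table by its pair
```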

Upvotes: 2
