Bernhard

Reputation: 4417

R t.test() with data.frames as arguments

There is a question on CrossValidated where someone passed two data frames instead of two vectors to the t.test function: https://stats.stackexchange.com/questions/261830/t-test-or-wilcox-in-r-and-how-to-apply-to-dataframe-splitted-in-2-groups/

Here is a shorter example:

a <- data.frame(foo=1:5, bar=5:9)
b <- data.frame(foo=1:5, bar=5:9)
t.test(a,b)

The help page for the t.test function clearly states that x and y should be

a (non-empty) numeric vector of data values.

Still, the above code throws no error and returns a result. What does that result mean?

Upvotes: 2

Views: 410

Answers (2)

Eric Lecoutre

Reputation: 1481

You can have a look at the code inside:

 stats:::t.test.default

Here are some selected pieces of that code:

function (x, y = NULL, alternative = c("two.sided", "less", "greater"), 
    mu = 0, paired = FALSE, var.equal = FALSE, conf.level = 0.95, 
    ...) 
{
    alternative <- match.arg(alternative)
    if (!missing(mu) && (length(mu) != 1 || is.na(mu))) 
    ### snip
    if (!is.null(y)) {
    ### snip
       yok <- !is.na(y)
       xok <- !is.na(x)
    ### snip
      y <- y[yok]

So there is a y argument, and when y is a data frame, yok is a logical matrix. Indexing with y[yok] then returns a plain vector rather than a data frame. In the end, all the computations run on the data frames flattened into vectors.

Definitely not what one would intend, but it is also a misspecification on the user's part...
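To make the flattening step concrete, here is a small sketch that reproduces just the indexing outside of t.test. It reuses the data frame a from the question; the names ok and v are made up for illustration and do not appear in the stats source.

```r
# Same data frame as in the question
a <- data.frame(foo = 1:5, bar = 5:9)

# !is.na() on a data frame returns a logical matrix, not a vector
ok <- !is.na(a)
is.matrix(ok)

# Indexing a data frame with a logical matrix drops the data frame
# structure and returns the selected values as a plain vector
v <- a[ok]
is.data.frame(v)
length(v)

# Column-wise flattening: same values as unlist() without the names
identical(v, unlist(a, use.names = FALSE))
```

The values come out column by column, so the column structure of the data frame is lost before any statistic is computed.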

Upvotes: 3

Roland

Reputation: 132706

This is undocumented behavior, but you are going against the documentation when you pass data.frames.

This happens:

x <- a
y <- b
yok <- !is.na(y)
xok <- !is.na(x)
y <- y[yok]
#[1] 1 2 3 4 5 5 6 7 8 9
x <- x[xok]
#[1] 1 2 3 4 5 5 6 7 8 9

Basically, you get the same result as if you did t.test(unlist(a), unlist(b)).
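A quick way to check this claim for the example from the question is to run both calls and compare the pieces of the result. This is only a sketch for this particular data; res_df and res_vec are illustrative names.

```r
a <- data.frame(foo = 1:5, bar = 5:9)
b <- data.frame(foo = 1:5, bar = 5:9)

# Data frames passed directly (against the documentation)
res_df <- t.test(a, b)

# Explicitly flattened vectors
res_vec <- t.test(unlist(a), unlist(b))

# Compare test statistic, degrees of freedom, p-value and confidence interval
all.equal(res_df$statistic, res_vec$statistic)
all.equal(res_df$parameter, res_vec$parameter)
all.equal(res_df$p.value, res_vec$p.value)
all.equal(res_df$conf.int, res_vec$conf.int)
```

Only the data.name element of the two results differs, since it records how the arguments were written in the call.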

Upvotes: 2
