Lee Drown

Reputation: 45

T-test on two columns in R

I am trying to do a t-test to see whether the values in two columns from two data frames are significantly different.

I am trying to run code that compares the "Duration" column in two data frames, "Tokens" and "Tokens.Single". Both data frames have the same number of values in their Duration columns.

Here is the code I am trying:

# T-test for duration.
t.test(Tokens$Duration ~ Tokens.Single$Duration, paired=FALSE, var.equal=TRUE)

And this is the error message I received:

Error in t.test.formula(Tokens$Duration ~ Tokens.Single$Duration, paired = FALSE,  : 
  grouping factor must have exactly 2 levels

Any insight is appreciated!

Upvotes: 0

Views: 4531

Answers (1)

MDEWITT

Reputation: 2368

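Without a peek at your data it's hard to say for certain, but the formula syntax you are using in t.test is meant for a response split by a two-level grouping factor (response ~ group). Tokens.Single$Duration is presumably a numeric column with many distinct values, so it cannot serve as the grouping factor, hence the error.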
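To see where the error comes from, here is a minimal sketch with simulated data. The vectors a and b are hypothetical stand-ins for the two Duration columns, not your actual data:

```r
set.seed(1)

# Hypothetical stand-ins for Tokens$Duration and Tokens.Single$Duration
a <- rnorm(50)
b <- rnorm(50)

# The right-hand side of the formula is treated as a grouping variable.
# b holds 50 distinct values, so it defines 50 groups instead of 2.
n_groups <- nlevels(factor(b))

# Capture the error message instead of stopping the script
res <- tryCatch(
  t.test(a ~ b),
  error = function(e) conditionMessage(e)
)

print(n_groups)
print(res)
```

The captured message should be the same kind of grouping-factor complaint you saw, because t.test has no way to split a into exactly two groups.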

Based on your description of your data, you would be better off using the following syntax:

y <- rnorm(50)
x <- rnorm(50)

t.test(x,y)

This compares the means of the numeric vectors x and y. In your case:

t.test(Tokens$Duration , Tokens.Single$Duration, paired=FALSE, var.equal=TRUE)

Just for completeness, if you had a factor variable indicating the run or experiment number, you could use the formula syntax, e.g.:

y <- rnorm(50)
z <- rep(c("A","B"), 25)
t.test(y ~ z)

Yielding:

data:  y by z
t = -2.0418, df = 47.504, p-value = 0.04675
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -1.07859422 -0.00814587
sample estimates:
mean in group A mean in group B 
      0.1162672       0.6596372 
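If you prefer the formula interface for your own data, one option is to stack the two Duration columns into a single long data frame with a column recording where each value came from. The sketch below uses simulated data; the data frames are hypothetical stand-ins for yours, and the names long and Source are made up for illustration:

```r
set.seed(42)

# Hypothetical data frames standing in for Tokens and Tokens.Single
Tokens <- data.frame(Duration = rnorm(50, mean = 10))
Tokens.Single <- data.frame(Duration = rnorm(50, mean = 11))

# Stack both Duration columns and record which data frame each value came from
long <- data.frame(
  Duration = c(Tokens$Duration, Tokens.Single$Duration),
  Source = factor(rep(c("Tokens", "Tokens.Single"),
                      times = c(nrow(Tokens), nrow(Tokens.Single))))
)

# Formula interface: response ~ two-level grouping factor
by_formula <- t.test(Duration ~ Source, data = long, var.equal = TRUE)

# Two-vector interface on the same values
by_vectors <- t.test(Tokens$Duration, Tokens.Single$Duration, var.equal = TRUE)

# Compare the test statistics from the two interfaces
same_stat <- all.equal(unname(by_formula$statistic),
                       unname(by_vectors$statistic))
print(same_stat)
```

Both calls run the same test on the same values, so which interface to use is mostly a question of how your data are already laid out.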

Upvotes: 3
