Ram6

Reputation: 67

Running multiple "PAIRED" t.tests to compare pairs of column values in a data frame in R

I have a data frame with many numeric variables for 60 participants. Each participant has two values for each variable (before the intervention and during the intervention). I'd like to run a paired t.test on each variable in this data frame.

The data frame looks like this:

Log.Name     fat     protein    carbs 
before R     19      32         134   
during R     21      43         167    
before R     32      14         322
during R     25      32         213
before R     42      34         201  
during R     34      23         305

I tried two different approaches. The first:

qw<- matrix(lapply(names(new.averages)[-1], function(x){
  t.test(new.averages[new.averages$Log.Name =="before R", x], 
         new.averages[new.averages$Log.Name=="during R", x], mu=0, alt="two.sided", paired = F)$p.value}))

This works with paired = FALSE (as shown), but with paired = TRUE it throws the following error:

( Error in t.test.default(new.averages[new.averages$Log.Name == "before R", : not enough 'x' observations )

lapply(new.averages[-1], function(x) t.test(x ~ new.averages$Log.Name, paired=F)$p.value)

This one also works when paired = FALSE, but with paired = TRUE it throws the following error:

Error in complete.cases(x, y) : not all arguments have the same length

When I run the paired t.tests individually they work, but doing that for every variable would take hours when it should be a single call.

Any ideas?

Upvotes: 1

Views: 784

Answers (2)

Chuck P

Reputation: 3923

You can use the formula interface and then loop over the variables with lapply or map:

library(purrr)

# first a single case
t.test(fat ~ Log.Name, data = df,  paired = TRUE)
#> 
#>  Paired t-test
#> 
#> data:  fat by Log.Name
#> t = 1.3628, df = 2, p-value = 0.3061
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#>  -9.34823 18.01490
#> sample estimates:
#> mean of the differences 
#>                4.333333

# then build a named vector of all the variables you want to test

tobetested <- names(df[-1])
names(tobetested) <- names(df[-1])

# you can use paste to build the formula on the fly

map(tobetested, 
    ~ t.test(as.formula(paste(.,  "~ Log.Name")), 
             data = df,  
             paired = TRUE))
#> $fat
#> 
#>  Paired t-test
#> 
#> data:  fat by Log.Name
#> t = 1.3628, df = 2, p-value = 0.3061
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#>  -9.34823 18.01490
#> sample estimates:
#> mean of the differences 
#>                4.333333 
#> 
#> 
#> $protein
#> 
#>  Paired t-test
#> 
#> data:  protein by Log.Name
#> t = -0.68674, df = 2, p-value = 0.5632
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#>  -43.59182  31.59182
#> sample estimates:
#> mean of the differences 
#>                      -6 
#> 
#> 
#> $carbs
#> 
#>  Paired t-test
#> 
#> data:  carbs by Log.Name
#> t = -0.14906, df = 2, p-value = 0.8952
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#>  -278.7487  260.0821
#> sample estimates:
#> mean of the differences 
#>               -9.333333

Your data

library(readr)

df <- read_table("Log.Name     fat     protein    carbs
before R     19      32         134
during R     21      43         167
before R     32      14         322
during R     25      32         213
before R     42      34         201
during R     34      23         305")
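One caveat worth checking on your own installation: as far as I know, recent R releases (4.4.0 onward) reject the paired argument in the formula method of t.test, so the formula calls above may stop with an error there. A minimal sketch that avoids the formula interface is to split the rows by Log.Name and pass the two vectors explicitly. It assumes that the i-th "before R" row and the i-th "during R" row belong to the same participant; the names before, during, and p_values are mine, not from the question.

```r
# Same toy data as above, built with base R so the block is self-contained
df <- data.frame(
  Log.Name = rep(c("before R", "during R"), times = 3),
  fat      = c(19, 21, 32, 25, 42, 34),
  protein  = c(32, 43, 14, 32, 34, 23),
  carbs    = c(134, 167, 322, 213, 201, 305)
)

# Split the numeric columns by condition
before <- df[df$Log.Name == "before R", -1]
during <- df[df$Log.Name == "during R", -1]

# Assumption: row i of `before` and row i of `during` are the same participant,
# so both halves must at least have the same number of rows
stopifnot(nrow(before) == nrow(during))

# One paired test per variable, keeping only the p-value
p_values <- vapply(
  names(before),
  function(v) t.test(before[[v]], during[[v]], paired = TRUE)$p.value,
  numeric(1)
)
p_values
```

If the stopifnot check fails on the real data, the two conditions do not have the same number of rows, which would also explain the "not all arguments have the same length" error in the question.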

Upvotes: 0

Allan Cameron

Reputation: 174478

You can pull each column of interest out of the data frame and compare the elements with an odd index to those with an even index if that's how your data are laid out:

lapply(new.averages[-1], function(x) {
 t.test(x[seq_along(x) %% 2 == 1], 
        x[seq_along(x) %% 2 == 0], paired = TRUE)$p.value
})

#> $fat
#> [1] 0.3061113
#> 
#> $protein
#> [1] 0.5631788
#> 
#> $carbs
#> [1] 0.8951818
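If you want more than the p-value, the same loop can return one row per variable. The following is only a sketch on the example data from the question: it first checks that the labels really alternate "before R" / "during R" (the odd/even indexing silently gives wrong pairs otherwise), and the column names of results are my own choice.

```r
# Example data from the question
new.averages <- data.frame(
  Log.Name = rep(c("before R", "during R"), times = 3),
  fat      = c(19, 21, 32, 25, 42, 34),
  protein  = c(32, 43, 14, 32, 34, 23),
  carbs    = c(134, 167, 322, 213, 201, 305)
)

# TRUE for rows 1, 3, 5, ...
odd <- seq_len(nrow(new.averages)) %% 2 == 1

# Guard: the odd/even split is only valid if the labels strictly alternate
stopifnot(all(new.averages$Log.Name[odd]  == "before R"),
          all(new.averages$Log.Name[!odd] == "during R"))

# One paired test per variable, collected into a single data frame
results <- do.call(rbind, lapply(names(new.averages)[-1], function(v) {
  x  <- new.averages[[v]]
  tt <- t.test(x[odd], x[!odd], paired = TRUE)
  data.frame(variable  = v,
             mean_diff = unname(tt$estimate),   # mean of before - during
             t         = unname(tt$statistic),
             df        = unname(tt$parameter),
             p_value   = tt$p.value)
}))
results
```

On the real data, replace the example data frame with your own new.averages; if the guard fails, sort or reshape the rows by participant before testing.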

Upvotes: 1
