helpmeplease

Reputation: 15

Loop a t-test through a list of data frames

I have a load of survey data that I need to run t-tests on. It looks something like this (but not much like this; a dolphin is unlikely to be 52 mm):

Area                    Season  Species Length (mm)
Christchurch            Spring  dolphin 52
Christchurch            Spring  dolphin 54
Christchurch            Spring  dolphin 46
Christchurch            Spring  dolphin 40
Christchurch            Spring  dolphin 38
Christchurch            Autumn  dolphin 52
Christchurch            Autumn  dolphin 54
Christchurch            Autumn  dolphin 46
Christchurch            Autumn  dolphin 40
Christchurch            Autumn  dolphin 38
Christchurch            Spring  ray     52
Christchurch            Spring  ray     54
Christchurch            Spring  ray     46
Christchurch            Spring  ray     40
Christchurch            Spring  ray     38
Christchurch            Autumn  ray     52
Christchurch            Autumn  ray     54
Christchurch            Autumn  ray     46
Christchurch            Autumn  ray     40
Christchurch            Autumn  ray     38

My problem is that I have a range of species and about 2000 measurements, and I need to run a paired t-test between seasons for each species. I am very new to R and to coding in general, so any help in making this process more efficient is appreciated; I am fully aware I have probably not gone about this in the most streamlined way.

I'd like to loop the t-test over the species somehow, get a nice, understandable output, and be able to apply the script easily to other locations (I have 6).

I have split the large data frame by species and removed the empty data frames from the list:

list_df<-split(ld22,ld22$SPECIES_NAME)
list_df<-list_df[sapply(list_df, nrow) > 0]

I then tried this, which I found by googling the problem:

p <-list()
for (i in 1:length(list_df)) {
  p[[i]] <- pairwise.t.test(list_df[[i]]$TOTAL_LENGTH_MM, list_df[[i]]$SURVEY_TYPE, p.adjust = "none")
}
p

There are no error codes but I don't get any results and I have no idea where to go next. Any help would be much appreciated.

Upvotes: 0

Views: 60

Answers (3)

chris jude

Reputation: 498

Write a function and apply it to each data frame with map(). Can you share the output of dput(list_df) if this doesn't work?

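library(magrittr)   # provides the %$% exposition pipe
library(tidyverse)  # attaches purrr, which provides map()

# Run the pairwise t-test between survey types for one species' data frame
my_function <- function(df) {
  # p.adjust.method is the argument's full name; p.adjust only works via partial matching
  df %$% pairwise.t.test(TOTAL_LENGTH_MM, SURVEY_TYPE, p.adjust.method = "none")
}

# Apply the function to every data frame in the list
map(list_df, my_function)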
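
The question asks for a paired test, which pairwise.t.test() does not run by default. Below is a minimal sketch of the same function-plus-apply pattern with paired = TRUE. It assumes each species has the same number of measurements in both survey types and that the rows line up as pairs. The toy list, the helper make_species() and all the numbers are made up for illustration; they are not the asker's data.

```r
# Hypothetical stand-in for list_df: one data frame per species.
# The lengths are invented purely so the example runs.
make_species <- function(spring, autumn) {
  data.frame(
    SURVEY_TYPE = rep(c("Spring", "Autumn"), each = length(spring)),
    TOTAL_LENGTH_MM = c(spring, autumn)
  )
}
list_df <- list(
  dolphin = make_species(c(52, 54, 46, 40, 38), c(55, 56, 50, 41, 42)),
  ray     = make_species(c(30, 33, 29, 35, 31), c(31, 36, 30, 39, 32))
)

# Paired test: observation k in one survey type is matched with
# observation k in the other, so row order within each type matters.
paired_test <- function(df) {
  pairwise.t.test(df$TOTAL_LENGTH_MM, df$SURVEY_TYPE,
                  p.adjust.method = "none", paired = TRUE)
}

# One test result per species, then one p-value per species.
paired_results <- lapply(list_df, paired_test)
paired_p <- sapply(paired_results, function(res) res$p.value[1, 1])
paired_p
```

If the two survey types have different numbers of measurements for a species, the paired call is expected to fail for that species. In that case the measurements are probably not true pairs and the unpaired test may be the better fit.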

Upvotes: 0

Julian

Reputation: 9330

Everything in one go using purrr:

library(purrr)
library(dplyr)
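ld22 |>
  # split() names each element by its species, so names and data cannot get out of step
  split(ld22$Species) |>
  # drop empty groups; nrow() counts rows, whereas length() would count columns
  keep(~ nrow(.x) > 0) |>
  imap(~ pairwise.t.test(x = .x$Length, g = .x$Season, p.adjust.method = "none") |>
         broom::tidy() |>
         mutate(species = .y))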

Output:

$dolphin
# A tibble: 1 x 4
  group1 group2 p.value species
  <chr>  <chr>    <dbl> <chr>  
1 Spring Autumn       1 dolphin

$ray
# A tibble: 1 x 4
  group1 group2 p.value species
  <chr>  <chr>    <dbl> <chr>  
1 Spring Autumn       1 ray   

Upvotes: 1

harre

Reputation: 7307

We could use lapply instead of the loop to make it a bit less verbose. We would probably also want to extract the p.value from the returned list, i.e.:

p <- 
  split(ld22, ld22$Species) |>
  lapply(\(x) pairwise.t.test(x$Length, x$Season, p.adjust = "none")$p.value)

Output:

$dolphin
       Autumn
Spring      1

$ray
       Autumn
Spring      1
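
The list of 1 x 1 matrices above can be flattened into a single table, which may be easier to read when there are many species. This is a rough sketch in base R, assuming exactly two seasons per species (so each p.value matrix has a single cell). The data frame is rebuilt from the question's sample with data.frame() so the block runs on its own, and the name results is just an illustrative choice.

```r
# The question's sample data, rebuilt without readr.
ld22 <- data.frame(
  Area    = "Christchurch",
  Season  = rep(rep(c("Spring", "Autumn"), each = 5), times = 2),
  Species = rep(c("dolphin", "ray"), each = 10),
  Length  = rep(c(52, 54, 46, 40, 38), times = 4)
)

# One 1 x 1 p-value matrix per species, as in the answer above.
p <-
  split(ld22, ld22$Species) |>
  lapply(\(x) pairwise.t.test(x$Length, x$Season, p.adjust.method = "none")$p.value)

# Pull the single cell out of each matrix and bind everything into one table.
# Assumption: two seasons only; with more seasons each matrix has several cells.
results <- data.frame(
  species   = names(p),
  p_value   = vapply(p, \(m) m[1, 1], numeric(1)),
  row.names = NULL
)
results
```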

Data:

library("readr")

ld22 <- read_table("Area                    Season  Species Length
Christchurch            Spring  dolphin 52
Christchurch            Spring  dolphin 54
Christchurch            Spring  dolphin 46
Christchurch            Spring  dolphin 40
Christchurch            Spring  dolphin 38
Christchurch            Autumn  dolphin 52
Christchurch            Autumn  dolphin 54
Christchurch            Autumn  dolphin 46
Christchurch            Autumn  dolphin 40
Christchurch            Autumn  dolphin 38
Christchurch            Spring  ray     52
Christchurch            Spring  ray     54
Christchurch            Spring  ray     46
Christchurch            Spring  ray     40
Christchurch            Spring  ray     38
Christchurch            Autumn  ray     52
Christchurch            Autumn  ray     54
Christchurch            Autumn  ray     46
Christchurch            Autumn  ray     40
Christchurch            Autumn  ray     38")

Update:

Or just use dplyr:

library(dplyr)

ld22 |>
  group_by(Species) |>
  summarise(p_value = pairwise.t.test(Length, Season, p.adjust = "none")$p.value) |>
  ungroup()

Output:

# A tibble: 2 × 2
  Species p_value[,1]
  <chr>         <dbl>
1 dolphin           1
2 ray               1
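
The question also mentions six locations. One possible way to cover them with the same dplyr approach is to stack all locations into one data frame and add Area to the grouping. The sketch below assumes the column names from the sample data and exactly two seasons per group. "SecondSite" is a placeholder location that simply reuses the sample measurements; it is not real data.

```r
library(dplyr)

# The question's sample data for one location.
site <- data.frame(
  Area    = "Christchurch",
  Season  = rep(rep(c("Spring", "Autumn"), each = 5), times = 2),
  Species = rep(c("dolphin", "ray"), each = 10),
  Length  = rep(c(52, 54, 46, 40, 38), times = 4)
)

# Placeholder second location (made up: same numbers, different name).
all_sites <- bind_rows(site, mutate(site, Area = "SecondSite"))

results <- all_sites |>
  group_by(Area, Species) |>
  summarise(
    # [1, 1] turns the 1 x 1 matrix into a plain number, giving an ordinary column
    p_value = pairwise.t.test(Length, Season, p.adjust.method = "none")$p.value[1, 1],
    .groups = "drop"
  )
results
```

This gives one row per location and species, so adding another location only means adding its rows to the combined data frame.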

Upvotes: 1
