Reputation: 431
I want to do an unpaired t-test to examine if values differ between sites in each type category.
So my question is, within types (AB or CD), do values (valueA or valueB) differ between sites (A or B)?
Here is an example of my data:
dat <- data.frame(
"site" = c("A","B","B","A","A","B","B","A"),
"type" = c("AB","CD"),
"valueA" = c(13,-10,-5,18,-14,12,-17,19),
"valueB" = c(-3,20,15,-16,12,15,-11,14)
)
dat
site type valueA valueB
A AB 13 -3
B CD -10 20
B AB -5 15
A CD 18 -16
A AB -14 12
B CD 12 15
B AB -17 -11
A CD 19 14
I am trying to do four unpaired t-tests, one for each combination of type (AB or CD) and value column (valueA or valueB), each comparing site A with site B.
In order to run the unpaired t-test, I believe I need to re-arrange my data so that type AB and type CD and site A and site B are each a column (instead of being within the type or site column).
EDIT:
Using the suggested code in the comments:
library(dplyr)
dat %>%
group_by(site, type) %>%
summarise(pval = t.test(valueA, valueB)$p.value)
The output is this:
site type pval
A AB 0.784
A CD 0.417
B AB 0.492
B CD 0.365
As I understand it, each p-value here tests the difference between valueA and valueB within one site and type.
What I am looking for is, for example, the difference between site A and site B in valueA within type CD.
So, if I am thinking correctly, the output of the t-tests should have a column for type, valueA, and valueB, with each p-value testing the difference between sites.
Similar to this:
type valueA valueB
AB 0.365 0.784
CD 0.492 0.417
Does this make sense?
Upvotes: 0
Views: 100
Reputation: 8120
I think I see what you're asking for. See if this works for you:
library(tidyverse)
dat %>%
pivot_longer(cols = c(valueA, valueB), names_to = "name", values_to = "val") %>%
split(.$site) %>%
map(., ~rename(.x, !!sym(paste0(.x$site[[1]], "val")) := val) %>%
select(-site)) %>%
reduce(full_join, by = c("type", "name")) %>%
group_by(type, name) %>%
summarise(p.val = t.test(Aval, Bval)$p.value) %>%
pivot_wider(id_cols = type, names_from = name, values_from = p.val)
#> # A tibble: 2 x 3
#> # Groups: type [2]
#> type valueA valueB
#> <fct> <dbl> <dbl>
#> 1 AB 0.284 0.785
#> 2 CD 0.0703 0.121
Here we go from wide to long and split the dataframe by site. We then rename the value column in each piece to include the site, re-join the pieces, and run a grouped t.test by type and value column.
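One caveat with the join above: joining on type and name pairs every site-A row with every site-B row in a group, so each observation enters the test more than once. A minimal sketch that avoids the split and re-join, assuming the formula interface of t.test is acceptable here, is to stay in long format and let site define the two samples. The name site_pvals is only illustrative.

```r
library(dplyr)
library(tidyr)

# Example data from the question (type written out with rep() for clarity)
dat <- data.frame(
  site   = c("A", "B", "B", "A", "A", "B", "B", "A"),
  type   = rep(c("AB", "CD"), times = 4),
  valueA = c(13, -10, -5, 18, -14, 12, -17, 19),
  valueB = c(-3, 20, 15, -16, 12, 15, -11, 14)
)

site_pvals <- dat %>%
  # One row per observation and value column
  pivot_longer(cols = c(valueA, valueB), names_to = "name", values_to = "val") %>%
  group_by(type, name) %>%
  # Unpaired (Welch) t-test of val between the two sites within each group
  summarise(p.val = t.test(val ~ site)$p.value, .groups = "drop") %>%
  # One row per type, one column of p-values per value column
  pivot_wider(names_from = name, values_from = p.val)

site_pvals
```

Each cell is then the p-value for site A versus site B for that type and value column, computed on the original observations only.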
Upvotes: 1
Reputation: 887831
We can do a group_by on 'site' and 'type' and apply the t.test:
library(dplyr)
out <- dat %>%
group_by(site, type) %>%
summarise(pval = t.test(valueA, valueB)$p.value)
By default, paired = FALSE in t.test, so this is an unpaired test.
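The output above can be reshaped to 'wide' format with pivot_wider. Note that the resulting columns correspond to sites A and B (each p-value compares valueA with valueB within that site), despite the 'value' prefix.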
library(stringr)
library(tidyr)
out %>%
ungroup %>%
mutate(site = str_c('value', site)) %>%
pivot_wider(names_from = site, values_from = pval)
# A tibble: 2 x 3
# type valueA valueB
# <fct> <dbl> <dbl>
#1 AB 0.784 0.492
#2 CD 0.417 0.365
If we want to compare each 'value' column between types 'AB' and 'CD' within each site:
dat %>%
group_by(site) %>%
summarise_at(vars(starts_with('value')),
~ t.test(.[type == 'AB'], .[type == 'CD'])$p.value)
# A tibble: 2 x 3
# site valueA valueB
# <fct> <dbl> <dbl>
#1 A 0.393 0.784
#2 B 0.464 0.439
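The same pattern with the roles of site and type swapped gives the between-site comparison asked for in the question. This is only a sketch: it uses across(), which supersedes summarise_at in current dplyr, and the name by_type is illustrative.

```r
library(dplyr)

# Example data from the question (type written out with rep() for clarity)
dat <- data.frame(
  site   = c("A", "B", "B", "A", "A", "B", "B", "A"),
  type   = rep(c("AB", "CD"), times = 4),
  valueA = c(13, -10, -5, 18, -14, 12, -17, 19),
  valueB = c(-3, 20, 15, -16, 12, 15, -11, 14)
)

by_type <- dat %>%
  group_by(type) %>%
  summarise(
    # For each value column, unpaired t-test of site A against site B
    across(starts_with("value"),
           ~ t.test(.x[site == "A"], .x[site == "B"])$p.value),
    .groups = "drop"
  )

by_type
```

Here each row is a type and each column holds the p-value for the difference between sites in that value column.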
Upvotes: 2