My data is contained in a data.frame:
SYMBOL  variable        value     Sample    IDs   Group
TLR8    MMRF_2613_1_BM  3.186233  Baseline  2613  LessUp
TLR8    MMRF_2613_1_BM  5.471014  Baseline  2613  LessUp
TLR8    MMRF_2613_1_BM  2.917965  Baseline  2613  MostUp
TLR8    MMRF_2613_1_BM  2.147028  Baseline  2613  MostUp
TLR4    MMRF_2613_1_BM  7.497424  Baseline  2613  LessUp
TLR4    MMRF_2613_1_BM  4.16523   Baseline  2613  LessUp
TLR4    MMRF_2613_1_BM  7.136523  Baseline  2613  MostUp
TLR4    MMRF_2613_1_BM  7.96523   Baseline  2613  MostUp
For each SYMBOL, I would like to divide the sum of value for the rows where Group is "MostUp" by the sum of value for the "LessUp" rows.
I believe I could use the group_by function, but I am not sure how to apply it correctly.
Here is an example of my expected output.
SYMBOL  variable        value  Sample    IDs   Group
TLR8    MMRF_2613_1_BM  0.58   Baseline  2613  MostUp_divided_by_LessUp
TLR4    MMRF_2613_1_BM  1.29   Baseline  2613  MostUp_divided_by_LessUp
In addition to calculating the ratios, how would I perform a t-test between the two groups for each SYMBOL?
Upvotes: 0
Views: 101
Reputation: 388982
We could first calculate the sum of value for each Group within each SYMBOL, and then divide the 'MostUp' sum by the 'LessUp' sum.
library(dplyr)

df %>%
  # sum value within each SYMBOL / Group combination
  group_by(SYMBOL, variable, Sample, IDs, Group) %>%
  summarise(value = sum(value)) %>%
  # the first summarise() drops the Group level, so this one runs once per SYMBOL
  summarise(value = value[Group == 'MostUp'] / value[Group == 'LessUp'])

#  SYMBOL variable       Sample     IDs value
#  <fct>  <fct>          <fct>    <int> <dbl>
#1 TLR4   MMRF_2613_1_BM Baseline  2613 1.29
#2 TLR8   MMRF_2613_1_BM Baseline  2613 0.585
To run t.test between the groups, we can do:
df1 <- df %>%
  group_by(SYMBOL, variable, Sample, IDs) %>%
  summarise(value = list(t.test(value[Group == 'MostUp'],
                                value[Group == 'LessUp'])))

df1
# A tibble: 2 x 5
# Groups:   SYMBOL, variable, Sample [2]
#  SYMBOL variable       Sample     IDs value
#  <fct>  <fct>          <fct>    <int> <list>
#1 TLR4   MMRF_2613_1_BM Baseline  2613 <htest>
#2 TLR8   MMRF_2613_1_BM Baseline  2613 <htest>
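The list column above holds whole htest objects, so the numbers of interest still have to be pulled out. As a cross-check that does not depend on dplyr, here is a minimal base R sketch that computes the ratio and extracts the t statistic and p-value per SYMBOL. The names res, most, less and tt are only illustrative, and the data frame is rebuilt here with just the three columns the calculation needs, using the values from the question.

```r
# Values copied from the question; only the columns needed for the calculation.
df <- data.frame(
  SYMBOL = rep(c("TLR8", "TLR4"), each = 4),
  value  = c(3.186233, 5.471014, 2.917965, 2.147028,
             7.497424, 4.16523, 7.136523, 7.96523),
  Group  = rep(c("LessUp", "LessUp", "MostUp", "MostUp"), times = 2)
)

# One output row per SYMBOL: ratio of sums plus the t-test statistic and p-value.
res <- do.call(rbind, lapply(split(df, df$SYMBOL), function(d) {
  most <- d$value[d$Group == "MostUp"]
  less <- d$value[d$Group == "LessUp"]
  tt   <- t.test(most, less)  # Welch two-sample t-test, R's default
  data.frame(SYMBOL    = d$SYMBOL[1],
             ratio     = sum(most) / sum(less),
             statistic = unname(tt$statistic),
             p.value   = tt$p.value)
}))

res
```

With only two observations per group, as in this sample, the test has very little power, so the p-values are best read as a check that the code runs rather than as evidence about the groups.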
data
df <- structure(list(SYMBOL = structure(c(2L, 2L, 2L, 2L, 1L, 1L, 1L,
1L), .Label = c("TLR4", "TLR8"), class = "factor"), variable = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "MMRF_2613_1_BM", class = "factor"),
value = c(3.186233, 5.471014, 2.917965, 2.147028, 7.497424,
4.16523, 7.136523, 7.96523), Sample = structure(c(1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L), .Label = "Baseline", class = "factor"),
IDs = c(2613L, 2613L, 2613L, 2613L, 2613L, 2613L, 2613L,
2613L), Group = structure(c(1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L
), .Label = c("LessUp", "MostUp"), class = "factor")),
class = "data.frame", row.names = c(NA, -8L))