Reputation: 27
I am trying to include Pooled Standard Error (PSE) and Eta square to tbl_summary()
. PSE is calculated by sqrt(mean(residuals^2)/n)
, I tried to calculate step by step by extracting the residuals from either aov()
or lm()
, but I got the error saying The dimension of respected variable and the added statistic do not match. Expecting statistic/dataframe to be length/ no. rows 1
. Here is my code:
PSE <- function(data, variable, by,...) {
aov(data[["variable"]] ~ as.factor(data[[by]]))$residuals
}
Dataset_TPA_Full %>%
select(diet,hardness_g,adhesiveness_g_sec, resilence, cohesion, springiness, gumminess, chewiness, firmness_g_force_1, density_g_l)%>%
tbl_summary(
by = diet,
statistic = all_continuous() ~ "{mean} ± {sem}",
label = list(hardness_g = "Hardness (g)",
adhesiveness_g_sec = "Adhesiveness (g/ sec)",
resilence = "Resilience",
cohesion = "Cohesion",
springiness = "Springiness",
gumminess = "Gumminess",
chewiness = "Chewiness",
firmness_g_force_1 = "Firmness (g)",
density_g_l = "Density (g/ L)")
) %>%
add_p(
test = all_continuous() ~ "aov",
) %>%
add_stat(fns = all_continuous() ~ PSE) %>%
modify_header(label = "**Treatment**", p.value = "**p-value**") %>%
bold_labels() %>%
bold_levels()
Also when I tried to add Eta squared using this code, it return missing data argument when I put it in the add_stat()
function
my_ES_test <- function(data, variable, by, ...) {
aovmod = aov(data[[variable]] ~ data[[by]])
lsr::etaSquared(aovmod)[1,1]
}
Can you help me with this? Thank you.
Upvotes: 1
Views: 536
Reputation: 21992
This should do it:
library(gtsummary)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
sem <- function(x){
sqrt(var(x, na.rm=TRUE)/sum(!is.na(x)))
}
PSE <- function(data, variable, by,...) {
e <- aov(data[[variable]] ~ as.factor(data[[by]]))$residuals
sqrt(mean(e^2)/length(e))
}
mtcars %>%
select(cyl, mpg, hp, disp, drat, qsec)%>%
tbl_summary(
by = cyl,
statistic = all_continuous() ~ "{mean} ± {sem}",
label = list(mpg = "Miles per Gallon",
hp = "Horsepower",
disp = "Displacement",
drat = "Rear Axel Ratio",
qsec = "1/4 Mile Time")
) %>%
add_p(
test = all_continuous() ~ "aov",
) %>%
add_stat(fns = all_continuous() ~ PSE) %>%
modify_header(label = "**Treatment**", p.value = "**p-value**", add_stat_1 = "**PSE**") %>%
bold_labels() %>%
bold_levels()
Created on 2022-04-17 by the reprex package (v2.0.1)
Note, the PSE()
function had two problems. First, the data[["variable"]]
should be data[[variable]]
(without the quotes around variable
). Second, you had the function return the residuals, not the PSE calculation you described in the question. Now, it returns the appropriate result. I also am not sure where you got the sem()
function, so I just made one that calculates the standard error of the mean.
PSE <- function(data, variable, by,...) {
s <- data %>%
group_by(!!sym(by)) %>%
summarise(s = var(!!sym(variable)),
n = n()) %>%
mutate(num = s*(n-1))
psd <- sqrt(sum(s$num)/(sum(s$n) - nrow(s)))
psd*sqrt(sum(1/s$n))
}
Upvotes: 3