Reputation: 438
Is there a way to apply a p-value adjustment on all p-values in the extra column? For by incorporating the p.adjust function somewhere? I am assuming that the pvalue function from the vignette is doing it's calculations per row, so it wouldn't make sense there. Is it possible anywhere in the arguments for the table1 function?
pvalue
function from the vignette (adjusted for >2 groups):
pvalue <- function(x, ...) {
# Construct vectors of data y, and groups (strata) g
y <- unlist(x)
g <- factor(rep(1:length(x), times=sapply(x, length)))
if (is.numeric(y)) {
# For numeric variables, perform a standard 2-sample t-test
p <- anova(lm(y ~ g))$`Pr(>F)`[1]
} else {
# For categorical variables, perform a chi-squared test of independence
p <- chisq.test(table(y, g))$p.value
}
# Format the p-value, using an HTML entity for the less-than sign.
# The initial empty string places the output on the line below the variable label.
c("", sub("<", "<", format.pval(p, digits=3, eps=0.001)))
}
table1 example from the vignette (dataset is replaced with the iris
dataset to make it more widely accessible:
table1(~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width | Species,
data=iris, overall=F, extra.col=list(`P-value`=pvalue))
As far as I understand the code above, you can not use the p.adjust
function (stats R) in the pvalue
function, as these are calculations on individual rows of the table. So could you specify it in the extra.col
attribute to the table1
function? extra.col=p.adjust(list(`P-value`=pvalue)))
or extra.col=list(`P-value`=p.adjust(pvalue)))
gives the error:
Error in p.adjust(list(P-value = pvalue)) : 'list' object cannot be coerced to type 'double'
Upvotes: 1
Views: 138
Reputation: 8676
You are correct that pvalue()
is being iteratively called. Because some corrections, such as "holm" depend on knowing all p-values in advance, you will need to post-process the table1()
output in those instances. Otherwise you could do the adjustment in pvalue()
a priori. However, post-processing will work regardless of chosen correction method.
Assuming you are targeting html, here is a post-processing solution using {rvest}
:
library(table1)
library(rvest)
library(stringr)
set.seed(42)
# example data with better spread of p-values
data <- data.frame(
x = factor(rep(paste("Group", 1:2), each = 10)),
y1 = c(rnorm(10, -0.7), rnorm(10, 0.7)),
y2 = c(rnorm(10, -0.9), rnorm(10, 0.9)),
y3 = c(rnorm(10, -1.2), rnorm(10, 1.2))
)
# modified pvalue function to return raw p-values
pvalue <- function(x, ...) {
# Construct vectors of data y, and groups (strata) g
y <- unlist(x)
g <- factor(rep(1:length(x), times = sapply(x, length)))
if (is.numeric(y)) {
# For numeric variables, perform a standard 2-sample t-test
p <- anova(lm(y ~ g))$`Pr(>F)`[1]
} else {
# For categorical variables, perform a chi-squared test of independence
p <- chisq.test(table(y, g))$p.value
}
# Format the p-value, using an HTML entity for the less-than sign.
# The initial empty string places the output on the line below the variable label.
c("", p) # c("", sub("<", "<", format.pval(p, digits=3, eps=0.001)))
}
my_table1 <- table1(
~ . | x,
data = data,
overall = FALSE,
extra.col = list(`P-value` = pvalue)
)
# extract raw p-values from table
raw_p <- my_table1[1] %>%
read_html() %>%
html_elements("body > table > tbody > tr > td:nth-child(4)") %>% # p-value is 4th column in example
as.character() %>%
str_extract("\\d+\\.\\d+e?-?\\d*")
# format the p-values with experiment-wise adjustment
new_p <- raw_p %>%
as.numeric() %>%
p.adjust(n = sum(!is.na(.))) %>%
format.pval(digits = 3, eps = 0.001)
# replace raw with formatted p-values
for (i in seq_along(raw_p)) {
if (!is.na(raw_p[i]))
my_table1[1] <- my_table1[1] %>% str_replace(raw_p[i], new_p[i])
}
# table as desired
my_table1
Created on 2024-09-17 with reprex v2.1.1
If you were using a correction that does not require knowing the p-values in advance such as "bonferroni", you could perform the adjustment in pvalue()
.
set.seed(42)
library(table1)
# example data with better spread of p-values
data <- data.frame(
x = factor(rep(paste("Group", 1:2), each = 10)),
y1 = c(rnorm(10, -0.7), rnorm(10, 0.7)),
y2 = c(rnorm(10, -0.9), rnorm(10, 0.9)),
y3 = c(rnorm(10, -1.2), rnorm(10, 1.2))
)
# modified pvalue function that performs correction a priori
pvalue <- function(x, ...) {
# Construct vectors of data y, and groups (strata) g
y <- unlist(x)
g <- factor(rep(1:length(x), times = sapply(x, length)))
if (is.numeric(y)) {
# For numeric variables, perform a standard 2-sample t-test
p <- anova(lm(y ~ g))$`Pr(>F)`[1]
} else {
# For categorical variables, perform a chi-squared test of independence
p <- chisq.test(table(y, g))$p.value
}
# Format the p-value, using an HTML entity for the less-than sign.
# The initial empty string places the output on the line below the variable label.
c("", sub("<", "<", format.pval(p.adjust(p, method = "bonferroni", n = 3), digits=3, eps=0.001)))
}
table1(
~ . | x,
data = data,
overall = FALSE,
extra.col = list(`P-value` = pvalue)
)
Created on 2024-09-17 with reprex v2.1.1
Upvotes: 0