RmyjuloR
RmyjuloR

Reputation: 438

R table1 package: Adjusted p-values in extra column

Is there a way to apply a p-value adjustment on all p-values in the extra column? For by incorporating the p.adjust function somewhere? I am assuming that the pvalue function from the vignette is doing it's calculations per row, so it wouldn't make sense there. Is it possible anywhere in the arguments for the table1 function?

pvalue function from the vignette (adjusted for >2 groups):

pvalue <- function(x, ...) {
# Construct vectors of data y, and groups (strata) g
y <- unlist(x)
g <- factor(rep(1:length(x), times=sapply(x, length)))
if (is.numeric(y)) {
# For numeric variables, perform a standard 2-sample t-test
p <- anova(lm(y ~ g))$`Pr(>F)`[1]
} else {
# For categorical variables, perform a chi-squared test of independence
p <- chisq.test(table(y, g))$p.value
}
# Format the p-value, using an HTML entity for the less-than sign.
# The initial empty string places the output on the line below the variable label.
c("", sub("<", "<", format.pval(p, digits=3, eps=0.001)))
}

table1 example from the vignette (dataset is replaced with the iris dataset to make it more widely accessible:

table1(~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width | Species,
data=iris, overall=F, extra.col=list(`P-value`=pvalue))

As far as I understand the code above, you can not use the p.adjust function (stats R) in the pvalue function, as these are calculations on individual rows of the table. So could you specify it in the extra.col attribute to the table1 function? extra.col=p.adjust(list(`P-value`=pvalue))) or extra.col=list(`P-value`=p.adjust(pvalue))) gives the error:

Error in p.adjust(list(P-value = pvalue)) : 'list' object cannot be coerced to type 'double'

Link: https://cran.r-project.org/web/packages/table1/vignettes/table1-examples.html#example-a-column-of-p-values

Upvotes: 1

Views: 138

Answers (1)

the-mad-statter
the-mad-statter

Reputation: 8676

You are correct that pvalue() is being iteratively called. Because some corrections, such as "holm" depend on knowing all p-values in advance, you will need to post-process the table1() output in those instances. Otherwise you could do the adjustment in pvalue() a priori. However, post-processing will work regardless of chosen correction method.

Post hoc Solution

Assuming you are targeting html, here is a post-processing solution using {rvest}:

library(table1)
library(rvest)
library(stringr)

set.seed(42)

# example data with better spread of p-values
data <- data.frame(
  x = factor(rep(paste("Group", 1:2), each = 10)),
  y1 = c(rnorm(10, -0.7), rnorm(10, 0.7)),
  y2 = c(rnorm(10, -0.9), rnorm(10, 0.9)),
  y3 = c(rnorm(10, -1.2), rnorm(10, 1.2))
)

# modified pvalue function to return raw p-values
pvalue <- function(x, ...) {
  # Construct vectors of data y, and groups (strata) g
  y <- unlist(x)
  g <- factor(rep(1:length(x), times = sapply(x, length)))
  if (is.numeric(y)) {
    # For numeric variables, perform a standard 2-sample t-test
    p <- anova(lm(y ~ g))$`Pr(>F)`[1]
  } else {
    # For categorical variables, perform a chi-squared test of independence
    p <- chisq.test(table(y, g))$p.value
  }
  # Format the p-value, using an HTML entity for the less-than sign.
  # The initial empty string places the output on the line below the variable label.
  c("", p) # c("", sub("<", "<", format.pval(p, digits=3, eps=0.001)))
}

my_table1 <- table1(
  ~ . | x,
  data = data,
  overall = FALSE,
  extra.col = list(`P-value` = pvalue)
)

# extract raw p-values from table
raw_p <- my_table1[1] %>%
  read_html() %>%
  html_elements("body > table > tbody > tr > td:nth-child(4)") %>% # p-value is 4th column in example
  as.character() %>%
  str_extract("\\d+\\.\\d+e?-?\\d*")

# format the p-values with experiment-wise adjustment
new_p <- raw_p %>%
  as.numeric() %>%
  p.adjust(n = sum(!is.na(.))) %>%
  format.pval(digits = 3, eps = 0.001)

# replace raw with formatted p-values
for (i in seq_along(raw_p)) {
  if (!is.na(raw_p[i]))
    my_table1[1] <- my_table1[1] %>% str_replace(raw_p[i], new_p[i])
}

# table as desired
my_table1

Created on 2024-09-17 with reprex v2.1.1

A priori Solution

If you were using a correction that does not require knowing the p-values in advance such as "bonferroni", you could perform the adjustment in pvalue().

set.seed(42)

library(table1)

# example data with better spread of p-values
data <- data.frame(
  x = factor(rep(paste("Group", 1:2), each = 10)),
  y1 = c(rnorm(10, -0.7), rnorm(10, 0.7)),
  y2 = c(rnorm(10, -0.9), rnorm(10, 0.9)),
  y3 = c(rnorm(10, -1.2), rnorm(10, 1.2))
)

# modified pvalue function that performs correction a priori
pvalue <- function(x, ...) {
  # Construct vectors of data y, and groups (strata) g
  y <- unlist(x)
  g <- factor(rep(1:length(x), times = sapply(x, length)))
  if (is.numeric(y)) {
    # For numeric variables, perform a standard 2-sample t-test
    p <- anova(lm(y ~ g))$`Pr(>F)`[1]
  } else {
    # For categorical variables, perform a chi-squared test of independence
    p <- chisq.test(table(y, g))$p.value
  }
  # Format the p-value, using an HTML entity for the less-than sign.
  # The initial empty string places the output on the line below the variable label.
  c("", sub("<", "<", format.pval(p.adjust(p, method = "bonferroni", n = 3), digits=3, eps=0.001)))
}

table1(
  ~ . | x,
  data = data,
  overall = FALSE,
  extra.col = list(`P-value` =  pvalue)
)

Created on 2024-09-17 with reprex v2.1.1

Upvotes: 0

Related Questions