Reputation: 1
I have a dataset with many columns. The first column is the outcome (Test), which is the dependent variable, y. Columns 2-32 are confounders. Finally, columns 33-54 are miRNA expression values, which are the independent variables, x.
I want to run a linear regression (to obtain the estimate and p-value) between each of the independent variables and the dependent variable.
I don't want to put all of them in the same model; I want a separate model for each miRNA, one by one.
I found this, which was quite useful:
apply(df[-1], 2, function(x) summary(lm(x ~ df$Test))$coef[1,c(1,4)])
Estimate -160.0660000 -382.2870000 136.4690000 106.9820000
Pr(>|t|) 0.6069965 0.3886881 0.7340981 0.7030296
However, now I want to adjust my models by all the confounders (columns 2-32).
I tried adding them like this, but it doesn't work:
apply(df[-1], 2, function(x) summary(lm(x ~ df$Test+confounders))$coef[1,c(1,4)])
Any idea?
Thanks! :)
Upvotes: 0
Views: 728
Reputation: 1
I was trying to do a similar analysis, but this time with partial correlation. I modified your script to do a correlation analysis and got that working. But now I want to adjust the model by age, sex, and other confounders (the same as in the regression model).
As before, I want to repeat the same analysis for each miRNA.
I tried the pcor.test function for partial correlation, with method "spearman", since the data are not normally distributed.
I modified the script, but it doesn't work. Some help please? Thanks!
#data
n <- 10000
nc <- 30
nm <- 20
y <- rnorm(n = n)
X <- matrix(rnorm(n = n*(nc+nm)), ncol = nc + nm)
df <- data.frame(y = y, X)
#variable names
confounders <- colnames(df)[2:31]
mirnas <- colnames(df)[32:51]
#auxiliar regression function
pcor_fun <- function(data, y_col, X_cols) {
formula <- as.formula(paste(y_col, X_cols))
pcor <- pcor.test(formula = formula, data = data, method = "spearman")
pcor_summary <- summary(pcor)$coef
return(pcor_summary)
}
#simple linear regressions
lm_list1 <- lapply(X = mirnas, FUN = pcor_fun, data = df, y_col = "y")
lm_list1[[1]]
#adjusting by confounders
lm_list2 <- lapply(X = mirnas, FUN = function(x) pcor_fun(data = df, y_col = "y", X_cols = c(confounders, x)))
lm_list2[[1]]
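For reference, a likely cause of the failure above: as far as I can tell from the ppcor documentation, pcor.test takes vectors x, y and a matrix or data frame z (plus method) rather than a formula and data, and summary(...)$coef is an lm idiom that does not apply to its result. The sketch below avoids ppcor altogether and computes a partial Spearman correlation in base R by rank-transforming the variables, regressing out the confounders, and correlating the residuals. The helper name partial_spearman, the small simulated data, and the t-test degrees of freedom (n - 2 - number of confounders) are assumptions made for illustration, not part of the original script.

```r
# Small simulated data set (assumption: sizes reduced so the example runs quickly)
set.seed(1)
n <- 200; nc <- 3; nm <- 4
df <- data.frame(y = rnorm(n), matrix(rnorm(n * (nc + nm)), ncol = nc + nm))
confounders <- colnames(df)[2:(1 + nc)]
mirnas <- colnames(df)[(2 + nc):(1 + nc + nm)]

# Hypothetical helper: partial Spearman correlation of y_col and x_col given z_cols
partial_spearman <- function(data, y_col, x_col, z_cols) {
  # Rank-transform every variable involved (Spearman = Pearson on ranks)
  r <- as.data.frame(lapply(data[c(y_col, x_col, z_cols)], rank))
  # Remove the part of y and x explained by the confounders
  res_y <- resid(lm(reformulate(z_cols, response = y_col), data = r))
  res_x <- resid(lm(reformulate(z_cols, response = x_col), data = r))
  est <- cor(res_y, res_x)
  # t-test with n - 2 - (number of confounders) degrees of freedom
  dof <- nrow(r) - 2 - length(z_cols)
  t_stat <- est * sqrt(dof / (1 - est^2))
  c(estimate = est, p.value = 2 * pt(-abs(t_stat), df = dof))
}

# One row per miRNA, columns estimate and p.value
pcor_tab <- t(sapply(mirnas, function(m) partial_spearman(df, "y", m, confounders)))
pcor_tab
```

The estimate should match what a precision-matrix approach on the Spearman correlation matrix gives; the p-value is an approximation that relies on the t-distribution assumption stated above.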
Upvotes: 0
Reputation: 1021
I think a good approach would be to create an auxiliary function to get the results you want. This function takes the name of the y column as well as the X columns, which can be a single string or a vector of strings:
# data
n <- 10000
nc <- 30
nm <- 20
y <- rnorm(n = n)
X <- matrix(rnorm(n = n*(nc+nm)), ncol = nc + nm)
df <- data.frame(y = y, X)
# variable names
confounders <- colnames(df)[2:31]
mirnas <- colnames(df)[32:51]
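# auxiliary regression function
lm_fun <- function(data, y_col, X_cols) {
  # build a formula such as y ~ X1 + X2 + ... from the column names
  formula <- as.formula(paste(y_col, "~", paste(X_cols, collapse = "+")))
  reg <- lm(formula = formula, data = data)
  # coefficient table: Estimate, Std. Error, t value, Pr(>|t|)
  reg_summary <- summary(reg)$coef
  return(reg_summary)
}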
# simple linear regressions
lm_list1 <- lapply(X = mirnas, FUN = lm_fun, data = df, y_col = "y")
lm_list1[[1]]
# adjusting by confounders
lm_list2 <- lapply(X = mirnas, FUN = function(x) lm_fun(data = df, y_col = "y", X_cols = c(confounders, x)))
lm_list2[[1]]
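Each element of lm_list2 is a full coefficient table, including the intercept and every confounder. If only the estimate and p-value of the miRNA itself are needed, the row can be selected by name. The following is a minimal sketch of that idea; the helper name mirna_row, the adjust argument, and the smaller simulated data set are assumptions for illustration, not part of the code above.

```r
# Small simulated data set (assumption: sizes reduced so the example runs quickly)
set.seed(1)
n <- 200; nc <- 3; nm <- 4
df <- data.frame(y = rnorm(n), matrix(rnorm(n * (nc + nm)), ncol = nc + nm))
confounders <- colnames(df)[2:(1 + nc)]
mirnas <- colnames(df)[(2 + nc):(1 + nc + nm)]

# Hypothetical helper: fit y ~ confounders + one miRNA and
# return only that miRNA's estimate and p-value
mirna_row <- function(mirna, data, y_col, adjust = character(0)) {
  f <- reformulate(c(adjust, mirna), response = y_col)
  coefs <- summary(lm(f, data = data))$coefficients
  coefs[mirna, c("Estimate", "Pr(>|t|)")]
}

# One row per miRNA, columns Estimate and Pr(>|t|)
adjusted <- t(sapply(mirnas, mirna_row, data = df, y_col = "y", adjust = confounders))
adjusted
```

Selecting the row by the miRNA's name rather than by position avoids accidentally reporting the intercept, which is what coef[1, ] returns.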
Upvotes: 1