Michelle

Reputation: 1363

Passing correct variables to an lm function in R

I have several series of biochemical data to analyse by drug dose (3 levels) within sex. I used the function suggested by Eduardo Leoni in this answer to a similar question to create a base lm function that I can call for each of the almost 20 biochemistry analytes I need to analyse for each of two drugs.

Obviously, x will be different for each drug and y will be different for each analyte. z will always be the same dataframe.

This function works fine:

 GrpReg <- function(x,y,z) {
   ## coef and se in a data frame
   mr <- data.frame(coef(summary(lm(y ~ x,data=z))))
   ## put row names (predictors/indep variables)
   mr$predictor <- rownames(mr)
   mr
   }

And it gave me the correct output when I tested it with the females:

 GrpReg(subset(MyData$DrugE, MyData$Sex=="F"), 
      subset(MyData$Triglycerides, MyData$Sex=="F"),subset(MyData, Sex=="F"))

But I can't get the function to work when I want it to output for each sex separately. First I tried:

 TriEch <- by(MyData, MyData$Sex, GrpReg(MyData$DrugE, MyData$Triglycerides))
  Error in summary(lm(y ~ x, data = z)) : 
   error in evaluating the argument 'object' in selecting a method for function 'summary': 
  Error in model.frame.default(formula = y ~ x, data = z, drop.unused.levels = TRUE) : 
   argument "z" is missing, with no default 

and so I modified the function contents to include all variables and then got this error:

 TriDrugE <- by(MyData, MyData$Sex, GrpReg(MyData$DrugE, MyData$Triglycerides, MyData))
 Error in FUN(X[[1L]], ...) : could not find function "FUN"

I'm having problems finding an example where three variables are being passed to a function that I can copy.

Here's some play data to use:

 MyData<- as.data.frame(c(rnorm(10, 0.5, 0.1),rnorm(10, 0.6, 0.15),rnorm(10, 0.5, 0.08),rnorm(10, 0.61, 0.15),rnorm(10, 0.45, 0.11),rnorm(10, 0.55, 0.12),
    rnorm(10, 0.45, 0.12), rnorm(10, 0.45, 0.15)))
 colnames(MyData)<- "Triglycerides"
 MyData$Sex <- c(rep("F",10),rep("M",10),rep("F",10),rep("M",10),rep("F",10),rep("M",10),rep("F",10),rep("M",10))
 MyData$DrugE <- rep(c(0, 2, 4, 8), each=20)

Upvotes: 1

Views: 742

Answers (1)

Bryan Hanson

Reputation: 6223

You are overthinking a bit (easy to do). Separate out the subsetting; don't try to do it in the actual function call. It is much easier to read that way. You can also take advantage of R's built-in formula methods with a slight modification of your function. See if this is what you need:

GrpReg <- function(formula, data) {
   mr <- data.frame(coef(summary(lm(formula, data))))
   mr$predictor <- rownames(mr) # you don't really need this
   return(mr)
   }

males <- subset(MyData, Sex == "M") # Subset before calling function
male_results <- GrpReg(Triglycerides ~ DrugE, males) # response ~ predictor, as in the question's lm(y ~ x)
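
The `by()` calls in the question fail because the third argument must be a function, whereas `GrpReg(...)` there is evaluated first and its result (or its error) is what gets passed as `FUN`. A minimal sketch of one way to get both sexes in a single call is below. It assumes the formula-based `GrpReg` above, and the data frame is a simplified stand-in for the play data (the seed and distribution parameters are illustrative only):

```r
set.seed(1)  # assumption: seed added only so the sketch is reproducible

# Simplified stand-in for the question's play data (values are made up)
MyData <- data.frame(
  Triglycerides = rnorm(80, 0.5, 0.1),
  Sex           = rep(rep(c("F", "M"), each = 10), times = 4),
  DrugE         = rep(c(0, 2, 4, 8), each = 20)
)

GrpReg <- function(formula, data) {
  mr <- data.frame(coef(summary(lm(formula, data = data))))
  mr$predictor <- rownames(mr)
  mr
}

# by() hands each per-sex subset to the anonymous function as `d`
by_sex <- by(MyData, MyData$Sex, function(d) GrpReg(Triglycerides ~ DrugE, d))

by_sex[["F"]]  # coefficient table for the females only
```

Wrapping the call in an anonymous function keeps `GrpReg` unchanged; the subsetting is done by `by()` rather than inside the arguments.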

To process several of these data sets, something like this might work:

resp <- c("Triglycerides", "LDL", "HDL", "Cholesterol", "Potassium") # analytes (responses)
pred <- paste("Drug", LETTERS[5:1], sep = "") # drugs (predictors)
forms <- paste(resp, pred, sep = "~")

doAll <- function(forms, data) { # will fail on 2nd iteration (no data)
    for (i in seq_along(forms)) {
        tmp <- GrpReg(as.formula(forms[i]), data) # use the data argument, not a global
        print(tmp)
        }
    }

doAll(forms, males)
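
If the results need to be kept rather than printed, a variation on the loop above is to build each formula with `reformulate()` and collect the per-sex tables in a nested list. This is only a sketch under stated assumptions: `fit_one` is a hypothetical helper name, and the `LDL` column is invented here so that there is a second analyte to loop over.

```r
set.seed(2)  # assumption: seed added only so the sketch is reproducible

# Stand-in data; the LDL column is hypothetical and not part of the question's play data
MyData <- data.frame(
  Triglycerides = rnorm(80, 0.5, 0.1),
  LDL           = rnorm(80, 2.5, 0.4),
  Sex           = rep(rep(c("F", "M"), each = 10), times = 4),
  DrugE         = rep(c(0, 2, 4, 8), each = 20)
)

GrpReg <- function(formula, data) {
  mr <- data.frame(coef(summary(lm(formula, data = data))))
  mr$predictor <- rownames(mr)
  mr
}

# Hypothetical helper: fit analyte ~ drug separately within each sex
fit_one <- function(analyte, drug, data) {
  f <- reformulate(drug, response = analyte)  # e.g. LDL ~ DrugE
  lapply(split(data, data$Sex), function(d) GrpReg(f, d))
}

analytes <- c("Triglycerides", "LDL")
results <- lapply(analytes, fit_one, drug = "DrugE", data = MyData)
names(results) <- analytes

results$LDL$M  # coefficient table for LDL ~ DrugE in males
```

Returning a list instead of printing makes it easier to combine the tables later, for example by binding them row-wise with an added analyte and sex column.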

Upvotes: 1
