user2846211
user2846211

Reputation: 969

Linear mixed model - list error

I'm trying to implement a linear mixed model using the following code. is.ill is a binary response variable and I'm trying to predict this for each gene in my dataset.

for (gene in 1:122) {
nlme.m1 <- lme(is.ill ~ pulse + age + gender + data[,gene], random = ~1|pulse)
lmm_analysis = summary(nlme.m1)
}

However this code fails right now due to the error:

Error in model.frame.default(formula = ~is.ill + pulse +  : 
  invalid type (list) for variable 'data'

data is 331 x 122 but I can't see what the problem is?

Here's the output of:

str(data)
'data.frame':   331 obs. of  122 variables:
 $ A0000004: num  0.616 0.543 0.486 0.326 0.378 ...
 $ A0000005: num  1.96 2.05 1.98 1.94 2 ...
 $ A0000007: num  0.659 0.588 0.769 0.628 0.464 ...
 $ A0000002: num  0.1761 0.0864 0.1492 0.1673 0.2455 ...
 $ A0000008: num  0.185 0.117 0.79 0.117 0.117 ...
 $ A0000010: num  0.143 0.49 0.49 0.587 0.243 ...
 $ A0000011: num  2.2 1.5 1.34 1.27 1.77 ...
 $ A0000013: num  -0.25 -0.304 -0.467 -0.372 -0.544 ...
 $ A0000018: num  2.16 2.06 1.97 1.99 1.87 ...
 $ A0000021: num  2.19 2.27 2.28 2.26 2.48 ...
 $ A0000022: num  1.068 0.958 1.049 0.966 1.061 ...
 $ A0000023: num  0.3945 0.0414 0.1072 0.0607 0.1367 ...
 $ A0000025: num  1.111 1.111 1.401 0.831 0.799 ...
 $ A0000027: num  1.44 1.36 1.47 1.29 1.56 ...
 $ A0000030: num  0.0569 -0.1107 0.0755 -0.0964 0.1875 ...
 $ A0000032: num  0.155 0.267 0.885 0.713 0.769 ...
 $ A0000035: num  0.0828 -0.2596 -0.068 -0.1244 0.0607 ...
 $ A0000036: num  0.4654 0.0294 0.1139 -0.1379 0.4281 ...
 $ A0000038: num  -0.762 -0.595 -0.332 -0.374 -0.165 ...
 $ A0000039: num  0.415 1.955 1.314 2.049 0.415 ...
 $ A0000040: num  -1.248 -0.73 -0.62 -0.662 -0.917 ...
 $ A0000043: num  2.08 1.99 2.07 2.12 2.16 ...
 $ A0000045: num  0.999 -0.833 -0.699 -0.907 -0.327 ...
 $ A0000046: num  0.615 0.592 0.674 0.539 0.58 ...
 $ A0000048: num  2.32 2.04 2.46 2.02 2.36 ...
 $ A0000050: num  0.4409 0.2625 0.3139 0.0607 0.238 ...
 $ A0000052: num  1.71 1.66 1.94 1.8 1.92 ...
 $ A0000053: num  2.24 2.08 2.08 2.03 2.16 ...
 $ A0000059: num  -0.804 -0.804 -0.804 -0.804 -0.804 ...
 $ A0000060: num  2.48 2.69 2.67 2.55 2.57 ...
 $ A0000061: num  0.712 0.539 0.697 0.624 0.79 ...
 $ A0000062: num  1.52 1.34 1.29 1.23 1.26 ...
 $ A0000064: num  0.961 1.654 1.425 1.477 2.121 ...
 $ A0000065: num  1.61 1.94 1.8 1.47 1.61 ...
 $ A0000067: num  -1.85 -1.93 -2.04 -1.93 -2.24 ...
 $ A0000068: num  -0.64 -0.352 -0.228 -0.147 -0.314 ...
 $ A0000069: num  1.146 0.822 1.265 1.173 1.021 ...
 $ A0000070: num  -0.58 -0.839 0.173 0.983 -0.475 ...
 $ A0000074: num  2.92 2.53 3.26 2.55 2.78 ...
 $ A0000075: num  -0.0306 0.0492 0.1139 0.0492 0.2201 ...
 $ A0000076: num  -0.368 -0.62 -0.194 -0.387 0.1 ...
 $ A0000077: num  1.83 1.87 1.63 1.83 3.13 ...
 $ A0000078: num  0.863 0.604 0.814 0.362 0.819 ...
 $ A0000079: num  0.477 0.204 0.554 0.767 0.617 ...
 $ A0000082: num  2.1 2.04 2.05 2.1 2.18 ...
 $ A0000084: num  0.3598 0.3243 0.1553 0.2553 0.0453 ...
 $ A0000085: num  2.33 2.3 1.65 2.24 1.14 ...
 $ A0000086: num  1.62 1.65 1.53 1.78 1.21 ...
 $ A0000088: num  0.276 0.236 0.19 0.124 0.204 ...
 $ A0000091: num  1.12 1.03 1.31 1.12 1.16 ...
 $ A0000094: num  1.73 1.84 1.8 1.73 1.69 ...
 $ A0000095: num  1.72 2.05 2.01 2.02 1.77 ...
 $ A0000096: num  -0.0395 -0.1463 -0.1062 -0.1463 -0.1931 ...
 $ A0000097: num  2.82 2.38 2.48 2.48 2.4 ...
 $ A0000100: num  2.42 2.18 2.16 2.21 2.39 ...

etc

Upvotes: 1

Views: 693

Answers (1)

Ben Bolker
Ben Bolker

Reputation: 226881

In general R 'prefers' it if all the data for a model are included within the same data frame. Different packages vary in how fussy they are about this -- I think your approach might work with lm, for example -- but it looks like lme is a bit fussier. Here's an approach that might work:

Make up some testing data:

set.seed(101)
y <- rnorm(100)
x <- runif(100)
f <- gl(10,10)
z <- data.frame(zz1=rnorm(100),zz2=rnorm(100)) ## equivalent to your 'data'

Combine all the data into a single data frame:

d <- data.frame(x,y,f,z)

Construct base model: nothing extra

m0 <- lme(y~x,random=~1|f,data=d)

Put together a formula that looks like ~ . + NEWNAME, and use it to update the model:

n <- names(z)[1]
m1 <- update(m0,reformulate(c(".",n),response="."))

Apply this approach to all of the columns (genes)

fitList <- lapply(names(z),
                  function(n) update(m0,reformulate(c(".",n),response=".")))

I'm concerned about a couple of other things with your model specification:

  • is.ill sounds like a binary response; have you considered fitting a generalized linear mixed model (with glmer from the lme4 package)?
  • in general having the same variable (pulse in this case) included in the model as both a fixed effect and a grouping variable is a bad idea, although there is an exception where (1) the variable is continuous and (2) there are repeated measurements for many observed values of the variable (in this case the fixed effect will explain linear trend and the random effect will explain random variation around the line of groups of observations).

Upvotes: 1

Related Questions