Reputation: 320
I am trying to extract the parameters of a model I have designed for my work, but am having trouble doing so. I will use the iris data provided with R as an example data set.
I pull in the data:
library(dplyr)
df <- iris
I design a model that takes Sepal Width and Length, and determines what the best model fit given Sepal Length is to determine Sepal Width (this is probably a bad example of a data set, and I didn't pay much attention to the starting parameters, but I am almost certain this shouldn't affect how hard it is to answer this question). One model is produced for each Species:
sepalmodel <- df %>%
group_by(Species) %>%
do(model = nls(Sepal.Width ~ a*exp(Sepal.Length*b), start = list(a = 0.5, b = 0.9), data = .)) %>%
ungroup()
This produces a dataframe with one column being Species, and the other being model. In the model column, each model is represented in each row by a large list, which I have trouble extracting its basic data from. I would like to work with the parameters directly, as a more clean and reproducible way of representing and generating data with these models. How can I extract the a and b parameters from these models? Is there a function that does so, or do I need to dig into the code of each model? Also, is there any data built into that list that associates it with the specific species, or is it only directly associated with the species by nature of being found in the same row?
Upvotes: 2
Views: 2305
Reputation: 270268
Here are two approaches.
1) nlme This package comes with R so it does not have to be installed. The formula can be specified as usual followed by a vertical bar and the grouping variable allowing all models to be specified at once.
library(nlme)
fm <- nlsList(Sepal.Width ~ a * exp(Sepal.Length*b) | Species, data = iris,
start = list(a = 0.5, b = 0.5))
co <- coef(summary(fm))
co
giving this array so that, for example, co["setosa",,], co[, "Estimate", ], co[,,"a"] give matrices for setosa, for all estimates and for all "a" coefficients. The entire array is:
, , a
Estimate Std. Error t value Pr(>|t|)
setosa 1.068204 0.1729358 6.176882 3.226364e-08
versicolor 1.412344 0.2302122 6.134967 1.314458e-07
virginica 1.795394 0.2453925 7.316418 1.170751e-08
, , b
Estimate Std. Error t value Pr(>|t|)
setosa 0.23226126 0.03189911 7.281121 5.122595e-10
versicolor 0.11319887 0.02708865 4.178831 1.107640e-04
virginica 0.07643349 0.02046418 3.734989 9.965872e-04
ftable
can be used to view this array in various 2d ways. Also this link provides a function ftable2df that can convert an ftable to a data frame. For an example of using ftable:
ftable(co, row.vars = 3:2)
giving:
setosa versicolor virginica
a Estimate 1.068204e+00 1.412344e+00 1.795394e+00
Std. Error 1.729358e-01 2.302122e-01 2.453925e-01
t value 6.176882e+00 6.134967e+00 7.316418e+00
Pr(>|t|) 3.226364e-08 1.314458e-07 1.170751e-08
b Estimate 2.322613e-01 1.131989e-01 7.643349e-02
Std. Error 3.189911e-02 2.708865e-02 2.046418e-02
t value 7.281121e+00 4.178831e+00 3.734989e+00
Pr(>|t|) 5.122595e-10 1.107640e-04 9.965872e-04
and as another example
ftable(co, row.vars = c(1, 3))
giving:
Estimate Std. Error t value Pr(>|t|)
setosa a 1.068204e+00 1.729358e-01 6.176882e+00 3.226364e-08
b 2.322613e-01 3.189911e-02 7.281121e+00 5.122595e-10
versicolor a 1.412344e+00 2.302122e-01 6.134967e+00 1.314458e-07
b 1.131989e-01 2.708865e-02 4.178831e+00 1.107640e-04
virginica a 1.795394e+00 2.453925e-01 7.316418e+00 1.170751e-08
b 7.643349e-02 2.046418e-02 3.734989e+00 9.965872e-04
Also fm
and summary(fm)
are S3 objects, i.e. lists with a class variable. The components of the lists are available and can be viewed via:
str(fm)
str(summary(fm))
fm
has class vector c("nlsList", "lmList")
so you can find all the methods that are available for acting on fm
like this:
methods(class = "nlsList")
methods(class = "lmList")
In fact fm
has one component for each level of the grouping variable and each of those components is an nls
object and we can get the methods that can be applied to nls
objects like this:
methods(class = "nls")
Similarly summary(fm)
has class vector c("summary.nlsList", "summary.lmList")
and so we can find the methods that can be applied to it using:
methods(class = "summary.nlsList")
methods(class = "summary.lmList")
2) nls Using plain nls
we can create a single model by specifying the grouping variable as an index on each parameter. The starting values is a list containing a vector component for each parameter having an entry for each level of the grouping vector. Species has 3 levels so we have 3 starting values for a
and 3 for b
.
st <- list(a = rep(0.5, 3), b = rep(0.5, 3))
fm <- nls(Sepal.Width ~ a[Species] * exp(Sepal.Length*b[Species]), iris, start = st)
co <- coef(summary(fm))
co
giving this matrix:
Estimate Std. Error t value Pr(>|t|)
a1 1.06820417 0.17293582 6.176882 6.329948e-09
a2 1.41233648 0.23021095 6.134967 7.804013e-09
a3 1.79539324 0.24539239 7.316418 1.638244e-11
b1 0.23226125 0.03189911 7.281121 1.983862e-11
b2 0.11319979 0.02708865 4.178864 5.058757e-05
b3 0.07643355 0.02046418 3.734992 2.697815e-04
You could consider changing the row names to:
rownames(co) <- c(t(outer(c("a", "b"), levels(iris$Species), paste, sep = ".")))
co
giving:
Estimate Std. Error t value Pr(>|t|)
a.setosa 1.06820417 0.17293582 6.176882 6.329948e-09
a.versicolor 1.41233648 0.23021095 6.134967 7.804013e-09
a.virginica 1.79539324 0.24539239 7.316418 1.638244e-11
b.setosa 0.23226125 0.03189911 7.281121 1.983862e-11
b.versicolor 0.11319979 0.02708865 4.178864 5.058757e-05
b.virginica 0.07643355 0.02046418 3.734992 2.697815e-04
As in (1) we can inspect the components of fm using str(fm)
and can find the methods that can act on fm
using methods(class = "nls")
.
Upvotes: 2
Reputation: 1163
More generally, if you want to extract parameters from a model, you can do so in a similar manner to selecting columns from a dataframe. For example, in the case of nls
you can type ?nls
to check what you can return from the model (look under values). For example, nls$model
to get the model name, nls$weights
to get the weights and so on. Notably, the same approach can be used for correlations. You can type ?cor.test
(and look under values) to see the pAramters you can extract. cor.test(x,y)$estimate
Upvotes: 1
Reputation: 4243
One option is to use broom::tidy
with tidyr::unnest
.
library(tidyverse)
iris %>%
group_by(Species) %>%
do(model = nls(Sepal.Width ~ a*exp(Sepal.Length*b), start = list(a = 0.5, b = 0.9), data = .)) %>%
ungroup() %>%
mutate(tidy_nls = lapply(model, broom::tidy)) %>%
unnest(tidy_nls)
#------
# A tibble: 6 x 7
Species model term estimate std.error statistic p.value
<fct> <list> <chr> <dbl> <dbl> <dbl> <dbl>
1 setosa <nls> a 1.07 0.162 6.58 3.23e- 8
2 setosa <nls> b 0.232 0.0299 7.76 5.12e-10
3 versicolor <nls> a 1.41 0.228 6.18 1.31e- 7
4 versicolor <nls> b 0.113 0.0269 4.21 1.11e- 4
5 virginica <nls> a 1.80 0.261 6.87 1.17e- 8
6 virginica <nls> b 0.0764 0.0218 3.51 9.97e- 4
Upvotes: 3