Reputation: 9
I want to fit a model which includes a sum over a variable number of coefficients, such as here
I would like to write the code so that the number of coefficients can be specified by a potential user when running the fitting routine (i.e. setting i to any positive integer value), but I have no idea how to even start coding this. Of course I could simply write out the resulting equations for a large number of potential variables (e.g. for 1 < i < 10), and switch to the specified one at the beginning, but I would prefer a "prettier" solution (that is fewer lines), if at all available in R. I am currently using nls, but of course I would be happy with any solution.
Here's what I tried thus far:
Model <- function(X0, t, X, tau){X0+((X*tau*(exp(1/tau)-1))*exp(-t/tau))}
set.seed(1)
df <- data.frame(t=sort(sample(1:100,50)))
df$P <- jitter(with(df, Model(X0=0.1, t=t, X=0.2, tau=5)), 1000)
Model_fit <- nls(P ~ Model(X0=0.1, t=t, X=X1_fit, tau=5)
, start = list(X1_fit=c(0.1)), data=df, trace = F)
plot(df$t,df$P)
lines(df$t, predict(Model_fit, list(P = df$P)))
which works just fine. It's also easy to fit more than one coefficient, simply by passing a vector to X1_fit
. However, what I am now struggling with is how to formulate the sum in the formula correctly. I cannot just use
X=c(0.2,5)
Model <- function(X0, t, X, tau){X0+sum((X*tau*(exp(1/tau)-1))*exp(-t/tau))}
as this of course returns a single value, i.e. the dependence on t
gets lost. Instead, the code would need to take the sum only for every individual value of t
, so that the function still returns a vector of length t
. However, I am struggling to figure out how to do this without reverting to loops, which might also be difficult to fit via nls?
Cheers!
Edit
@G. Grothendieck's answer works fine, but I realised that the fitting procedure is very sensitive to just minor modifications in the equation, in particular when more than just one constant need to be fitted. For example
set.seed(1)
Model <- function(X0, t, X, tau) {
res <- X0 + rowSums(sapply(seq_along(X), function(i)
(X[i]*tau[i]/5*(exp(5/tau[i])-1))*exp(-t/tau[i])))}
df <- data.frame(t=sort(sample(1:100,50)))
df$P <- jitter(with(df, Model(X0=0.1, t=t, X=c(0.2, 2), tau=c(4, 7))), 1000)
Model_fit <- nls(P ~ Model(X0=0.1, t=t, X=X1_fit, tau=tau_fit)
, start = list(X1_fit=c(0.1, 1), tau_fit=5:6), data=df)
does not run
Error in nls(P ~ Model(X0 = 0.1, t = t, X = X1_fit, tau = tau_fit), start = list(X1_fit = c(0.1, :
step factor 0.000488281 reduced below 'minFactor' of 0.000976562
(I added a constant 5
at two places, and now also try to fit tau
), even when reducing the step-size and increasing the number of iterations. In principle, I appreciate that fitting 4 coefficients is prone to be unstable, but alternative code I have in Matlab seems to deal just fine with it (and even more fitting coefficients, for that matter). Perhaps this would be better in a separate question, but as the thread refers to "arbitrary number of fitting parameters", I feel this follow-up is best done as an edit.
Upvotes: 0
Views: 443
Reputation: 269654
sapply
over i creating a 2d matrix and use rowSums
to perform the summation:
set.seed(1)
Model <- function(X0, t, X, tau) {
res <- X0 + rowSums(sapply(seq_along(X), function(i)
(X[i]*tau[i]*(exp(1/tau[i])-1))*exp(-t/tau[i])))
}
df <- data.frame(t=sort(sample(1:100,50)))
df$P <- jitter(with(df, Model(X0=0.1, t=t, X=c(0.2, 2), tau=c(4, 7))), 1000)
Model_fit <- nls(P ~ Model(X0=0.1, t=t, X=X1_fit, tau=5:6)
, start = list(X1_fit=c(0.1, 1)), data=df, trace = F)
plot(df)
lines(fitted(Model_fit) ~ t, df, col = "red")
giving:
> Model_fit
Nonlinear regression model
model: P ~ Model(X0 = 0.1, t = t, X = X1_fit, tau = 5:6)
data: df
X1_fit1 X1_fit2
0.2107 2.0849
residual sum-of-squares: 0.5175
Number of iterations to convergence: 1
Achieved convergence tolerance: 1.511e-07
Upvotes: 1