Reputation:
I am using R to fit a logarithmic curve to my data, using the equation:
y = a * log(b * x)
My data looks like this:
#Creating example data
pre <- c(946116, 1243227, 1259646, 1434124, 1575268, 2192526, 3252832, 6076519)
post <- c(907355, 1553586, 1684253, 2592938, 1919173, 1702644,3173743, 3654198)
data <- data.frame(pre,post)
#Plotting data
library(ggplot2)
ggplot(data, aes(x=pre, y=post))+
  geom_point()
But when I try to fit a logarithmic curve using geom_smooth, I get an error.
# Fitting logarithmic curve
ggplot(data, aes(x=pre, y=post))+
geom_point()+
geom_smooth(method="nls", se=FALSE,
method.args=list(formula=y~a*log(b*x),
start=c(a=100, b=2)))
Warning messages:
1: In log(b * x) : NaNs produced
2: Computation failed in `stat_smooth()`:
Missing value or an infinity produced when evaluating the model
I get similar issues when I try to fit the logarithmic model with nls directly, without using ggplot:
model <- nls(data=data,
formula=y~a*log(b*x),
start=list(a=100, b=2))
Warning messages:
Error in numericDeriv(form[[3L]], names(ind), env) :
Missing value or an infinity produced when evaluating the model
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
3: In log(b * x) : NaNs produced
As someone who is new to R, I don't quite understand what the error messages are trying to tell me. I know that I need to change how I am specifying start conditions, but I don't know how.
Upvotes: 1
Views: 4916
Reputation: 510
I see a couple of problems in your nls call. 1) The formula uses the variables x and y, which don't exist in your data frame. They should be pre and post. 2) The magnitude of the numbers is giving nls trouble. It helps to divide them by 1,000,000 and to start from a=1, b=1:
pre <- c(946116, 1243227, 1259646, 1434124, 1575268, 2192526, 3252832, 6076519)
post <- c(907355, 1553586, 1684253, 2592938, 1919173, 1702644,3173743, 3654198)
# Rescale both variables to millions
pre <- pre/1000000
post <- post/1000000
data <- data.frame(pre,post)
model <- nls(data=data,
formula=post~a*log(b*pre),
start=list(a=1, b=1))
summary(model)
But as the other answer shows, changing the form of the equation helps without needing to change the scale of the data. Note that b is now an additive intercept, equal to a*log(b) in the original form:
pre <- c(946116, 1243227, 1259646, 1434124, 1575268, 2192526, 3252832, 6076519)
post <- c(907355, 1553586, 1684253, 2592938, 1919173, 1702644,3173743, 3654198)
data <- data.frame(pre,post)
model <- nls(data=data,
formula=post~a*log(pre)+b,
start=list(a=1, b=0))
summary(model)
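Because the rewritten model post = a*log(pre) + b is linear in a and b, it could also be fitted by ordinary least squares, which needs no starting values at all. A minimal sketch, assuming lm is an acceptable substitute for nls here (fit_lm is just an illustrative name, not something from your code):

```r
pre <- c(946116, 1243227, 1259646, 1434124, 1575268, 2192526, 3252832, 6076519)
post <- c(907355, 1553586, 1684253, 2592938, 1919173, 1702644, 3173743, 3654198)
data <- data.frame(pre, post)

# Linear in the parameters: the slope on log(pre) plays the role of a,
# and the intercept plays the role of b in post ~ a*log(pre) + b.
fit_lm <- lm(post ~ log(pre), data = data)
coef(fit_lm)
```

The same formula should also work inside ggplot via geom_smooth(method = "lm", formula = y ~ log(x)), if you only need the curve on the plot.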
Upvotes: 0
Reputation: 12074
Try this:
ggplot(data, aes(x=pre, y=post))+
  geom_point()+
  geom_smooth(method="nls", se=FALSE, formula=y~a*log(x)+k,
              method.args=list(start=c(a=1, k=1)))
Notice that it's essentially the same formula, but now k = a * log(b):
a * log(b * x) = a * {log(b) + log(x)} = a * log(x) + a * log(b) = a * log(x) + k
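The identity above also means the original b can be recovered from the fitted values as b = exp(k / a). A rough sketch, assuming the data from the question and a plain nls fit of the rewritten formula (fit, a_hat, k_hat and b_hat are illustrative names):

```r
pre <- c(946116, 1243227, 1259646, 1434124, 1575268, 2192526, 3252832, 6076519)
post <- c(907355, 1553586, 1684253, 2592938, 1919173, 1702644, 3173743, 3654198)
data <- data.frame(pre, post)

# Fit the rewritten model, which has an additive intercept k
fit <- nls(post ~ a * log(pre) + k, data = data, start = list(a = 1, k = 1))

a_hat <- coef(fit)[["a"]]
k_hat <- coef(fit)[["k"]]

# Invert k = a * log(b) to get b back
b_hat <- exp(k_hat / a_hat)

# Both parameterisations should give the same fitted values
all.equal(as.numeric(fitted(fit)), a_hat * log(b_hat * pre))
```

Since b_hat comes out of exp(), it is always positive, so log(b * x) is defined for every positive x. In the original form nothing stops the optimiser from trying a negative b, which is one way to end up with the NaN warning.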
Upvotes: 2