Reputation: 43
I have a simple dataframe for the response measurements from a drug treatment at various doses:
drug <- c("drug_1", "drug_1", "drug_1", "drug_1", "drug_1",
"drug_1", "drug_1", "drug_1", "drug_2", "drug_2", "drug_2",
"drug_2", "drug_2", "drug_2", "drug_2", "drug_2")
conc <- c(100.00, 33.33, 11.11, 3.70, 1.23, 0.41, 0.14,
0.05, 100.00, 33.33, 11.11, 3.70, 1.23, 0.41, 0.14, 0.05)
mean_response <- c(1156, 1833, 1744, 1256, 1244, 1088, 678, 489,
2322, 1867, 1333, 944, 567, 356, 200, 177)
std_dev <- c(117, 317, 440, 200, 134, 38, 183, 153, 719,
218, 185, 117, 166, 167, 88, 50)
df <- data.frame(drug, conc, mean_response, std_dev)
I can plot these point using the following code and get the basic foundation of the visualization that I would like:
p <- ggplot(data=df, aes(y=mean_response, x= conc, color = drug)) +
geom_pointrange(aes(ymax = (mean_response + std_dev), ymin = (mean_response - std_dev))) +
scale_x_log10()
p
The next thing I would like to do with these data is add a sigmoidal curve to the plot, that fits the plotted points for each drug. Following that, I would like to calculate the EC50 for this curve. I realize I may not have the entire range of the sigmoidal curve in my data, but I am hoping to get the best estimate I can with what I have. Also, the final point for drug_1 does not follow the expected trend of a sigmoidal curve, but this is actually not unexpected as the solutions that the drug is in can inhibit responses at high concentrations (each drug is in a different solution). I would like to exclude this point from the data.
I am getting stuck at the step of fitting a sigmoidal curve to my data. I have looked over some other solutions to fitting sigmoidal curves to data but none seem to work.
One post that is very close to my problem is this: (sigmoid) curve fitting glm in r
Based on it, I tried:
p + geom_smooth(method = "glm", family = binomial, se = FALSE)
This gives the following error, and seems to default to plotting straight lines:
`geom_smooth()` using formula 'y ~ x'
Warning message:
Ignoring unknown parameters: family
I have also tried the solution from this link: Fitting a sigmoidal curve to this oxy-Hb data
In this case, I get the following error:
Computation failed in `stat_smooth()`:
Convergence failure: singular convergence (7)
and no lines are added to the plot.
I have tried looking up both of these errors but cannot seem to find a reason that makes sense with my data.
Any help would be much appreciated!
Upvotes: 3
Views: 6414
Reputation: 226182
As I said in a comment, I would only use geom_smooth()
for a very easy problem; as soon as I run into trouble I use nls
instead.
My answer is very similar to @Duck's, with the following differences:
nls2
package, which provides a slightly more robust algorithmSSlogis()
to get automatic (self-starting) initial parameter selectionggplot2
, then feed it into geom_line()
p1 <- nls(mean_response~SSlogis(conc,Asym,xmid,scal),data=df,
subset=(drug=="drug_1" & conc<100)
## , weights=1/std_dev^2 ## error in qr.default: NA/NaN/Inf ...
)
library(nls2)
p1B <- nls2(mean_response~SSlogis(conc,Asym,xmid,scal),data=df,
subset=(drug=="drug_1" & conc<100),
weights=1/std_dev^2)
p2 <- update(p1,subset=(drug=="drug_2"))
p2B <- update(p1B,subset=(drug=="drug_2"))
pframe0 <- data.frame(conc=10^seq(log10(min(df$conc)),log10(max(df$conc)), length.out=100))
pp <- rbind(
data.frame(pframe0,mean_response=predict(p1,pframe0),
drug="drug_1",wts=FALSE),
data.frame(pframe0,mean_response=predict(p2,pframe0),
drug="drug_2",wts=FALSE),
data.frame(pframe0,mean_response=predict(p1B,pframe0),
drug="drug_1",wts=TRUE),
data.frame(pframe0,mean_response=predict(p2B,pframe0),
drug="drug_2",wts=TRUE)
)
library(ggplot2); theme_set(theme_bw())
(ggplot(df,aes(conc,mean_response,colour=drug)) +
geom_pointrange(aes(ymin=mean_response-std_dev,
ymax=mean_response+std_dev)) +
scale_x_log10() +
geom_line(data=pp,aes(linetype=wts),size=2)
)
I believe the EC50 is equivalent to the xmid
parameter ... note the large differences between weighted and unweighted estimates ...
Upvotes: 4
Reputation: 39595
I would suggest next approach which is close to what you want. I also tried with a setting for your data using binomial
family but there are some issues about values between 0 and 1. In that case you would need an additional variable in order to determine the respective proportions. The code in the following lines use a non linear approximation in order to sketch your output.
Initially, the data:
library(ggplot2)
#Data
df <- structure(list(drug = c("drug_1", "drug_1", "drug_1", "drug_1",
"drug_1", "drug_1", "drug_1", "drug_1", "drug_2", "drug_2", "drug_2",
"drug_2", "drug_2", "drug_2", "drug_2", "drug_2"), conc = c(100,
33.33, 11.11, 3.7, 1.23, 0.41, 0.14, 0.05, 100, 33.33, 11.11,
3.7, 1.23, 0.41, 0.14, 0.05), mean_response = c(1156, 1833, 1744,
1256, 1244, 1088, 678, 489, 2322, 1867, 1333, 944, 567, 356,
200, 177), std_dev = c(117, 317, 440, 200, 134, 38, 183, 153,
719, 218, 185, 117, 166, 167, 88, 50)), class = "data.frame", row.names = c(NA,
-16L))
In a non linear least squares, you need to define initial values for the search of ideal parameters. We use next code with base function nls()
to obtain those initial values:
#Drug 1
fm1 <- nls(log(mean_response) ~ log(a/(1+exp(-b*(conc-c)))), df[df$drug=='drug_1',], start = c(a = 1, b = 1, c = 1))
#Drug 2
fm2 <- nls(log(mean_response) ~ log(a/(1+exp(-b*(conc-c)))), df[df$drug=='drug_2',], start = c(a = 1, b = 1, c = 1))
With this initial approach of parameters, we sketch the plot using geom_smooth()
. We again use nls()
to find the right parameters:
#Plot
ggplot(data=df, aes(y=mean_response, x= conc, color = drug)) +
geom_pointrange(aes(ymax = (mean_response + std_dev), ymin = (mean_response - std_dev))) +
geom_smooth(data = df[df$drug=='drug_1',],method = "nls", se = FALSE,
formula = y ~ a/(1+exp(-b*(x-c))),
method.args = list(start = coef(fm1),
algorithm='port'),
color = "tomato")+
geom_smooth(data = df[df$drug=='drug_2',],method = "nls", se = FALSE,
formula = y ~ a/(1+exp(-b*(x-c))),
method.args = list(start = coef(fm0),
algorithm='port'),
color = "cyan3")
The output:
Upvotes: 1