Cyph

Reputation: 81

R - Trying to plot logistic curve, default plot and using 'curve(predict' does not add in logistic regression line

I'm trying to plot a logistic regression for two of the variables in my dataset. The structure of the relevant variables is:

str(df2)

 $ Threatened     : Factor w/ 2 levels "0","1": 1 2 1 1 1 1 2 1 2 2 ...
 $ tl_mmlog       : num  3.48 3.54 3.57 3.28 3.54 ...

When I plot with the default plotting function, the scatterplot is drawn correctly, but running the 'curve(predict' line does not add the logistic regression line. No error message is generated; the line just doesn't appear. The same code worked on a smaller dataset, so I'm not sure whether the issue is with this dataset. Does anyone know why the line isn't being drawn?

Logistic regression code:

attach(df2)

plot(x=tl_mmlog, y=Threatened)

fit2<-glm(Threatened~tl_mmlog, family=binomial)

curve(predict(fit2, data.frame(tl_mmlog=x), type="resp"), add=TRUE)

Upvotes: 0

Views: 554

Answers (1)

dcarlson

Reputation: 11056

Here is some data that resembles yours. I created it from the iris data set that is included with R and used dput() to produce a format that is easy to paste into R:

df2 <- structure(list(Sepal.Length = c(5.1, 4.9, 4.7, 4.6, 5, 5.4, 4.6, 
5, 4.4, 4.9, 5.4, 4.8, 4.8, 4.3, 5.8, 6.3, 5.8, 7.1, 6.3, 6.5, 
7.6, 4.9, 7.3, 6.7, 7.2, 6.5, 6.4, 6.8, 5.7, 5.8), Species = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("0", 
"1"), class = "factor")), row.names = c(1L, 2L, 3L, 4L, 5L, 6L, 
7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 101L, 102L, 103L, 104L, 
105L, 106L, 107L, 108L, 109L, 110L, 111L, 112L, 113L, 114L, 115L
), class = "data.frame")
str(df2)
# 'data.frame': 30 obs. of  2 variables:
#  $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
#  $ Species     : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...

Now compute the analysis and draw the plot:

fit2 <- glm(Species~Sepal.Length, df2, family=binomial)
with(df2, plot(Sepal.Length, Species))

Notice that the y-axis ranges from 1 to 2. That is because plot() uses the factor's underlying integer codes (1 and 2), not its level labels ("0" and "1"). The predict function, however, returns probabilities between 0 and 1, so the curve falls outside the plotted range and will not appear on your graph unless you add 1 to each predicted value before plotting. It is probably better to convert the factor to a numeric value so that the first level becomes 0 and the second becomes 1:

df2$Species <- as.numeric(as.character(df2$Species))
fit2 <- glm(Species~Sepal.Length, df2, family=binomial)
with(df2, plot(Sepal.Length, Species))
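
As a quick check of why the first plot ran from 1 to 2, the difference between a factor's integer codes and its level labels can be seen on a small made-up factor. The vector f below is hypothetical and only mimics the shape of the Threatened column; it is not your data:

```r
# Hypothetical factor with levels "0" and "1" (illustration only)
f <- factor(c("0", "1", "0", "0", "1"))

# Underlying integer codes, which is what plot() puts on the axis
codes <- as.integer(f)
print(codes)    # 1 2 1 1 2

# Level labels converted to numbers, which is what matches predict()
labels_num <- as.numeric(as.character(f))
print(labels_num)    # 0 1 0 0 1
```

Going through as.character() first is what recovers the 0/1 labels; calling as.numeric() directly on the factor would return the codes again.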

Now the plot ranges from 0 to 1. Next we add the curve, supplying the range of x values over which it should be drawn:

minmax <- range(df2$Sepal.Length)
curve(predict(fit2, data.frame(Sepal.Length=x), type="response"), from=minmax[1], to=minmax[2], add=TRUE)
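
If curve() still gives trouble, one alternative is to compute the fitted probabilities on a grid of x values and draw them with lines(). This is only a sketch under assumptions: the data frame d is rebuilt here from the built-in iris data (setosa coded 0, virginica coded 1) rather than taken from your data, and the names d, y, fit, and grid are made up for the example:

```r
# Illustrative data from iris (an assumption, not the asker's data):
# keep two species and code the response as 0/1
d <- subset(iris, Species != "versicolor")
d$y <- as.numeric(d$Species == "virginica")

fit <- glm(y ~ Sepal.Length, data = d, family = binomial)

# Evenly spaced x values across the observed range
grid <- data.frame(
  Sepal.Length = seq(min(d$Sepal.Length), max(d$Sepal.Length), length.out = 200)
)

# Fitted probabilities on the response scale (always between 0 and 1)
grid$p <- predict(fit, newdata = grid, type = "response")

# Scatterplot of the 0/1 response, then the fitted curve on top
plot(d$Sepal.Length, d$y, xlab = "Sepal.Length", ylab = "Probability")
lines(grid$Sepal.Length, grid$p)
```

Because the predictions are stored in grid$p, you can also inspect them with range(grid$p) to confirm they fall inside the y-axis limits of the plot.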


Upvotes: 1
