Reputation: 11762
I am trying to make a scatter plot and also plot a regression line for my data.
Before plotting, I want to replace the NAs with a fixed number so that every point shows up in the graph; since the replaced values all lie on one line, they are easy to spot.
But doing it this way messes up my geom_smooth. Is there a better way to show the missing values at a fixed number while keeping them out of the geom_smooth fit?
library(ggplot2)
set.seed(1234)
df <- data.frame(x = rnorm(100),
                 y = c(rnorm(40), rep(NA, 60)))
df[is.na(df)] <- -5  # replace every NA with the placeholder value -5
ggplot(df, aes(x, y)) + geom_point() + geom_smooth(method = "lm", fullrange = TRUE)
As you can see in the example, the smooth line moves to the "imputed" values.
Upvotes: 1
Views: 1681
Reputation: 49033
One way to do it is to keep two data frames: the original df with its NAs still in place (so skip the replacement step from your question), and a copy in which they are replaced:
df2 <- df
df2[is.na(df2)] <- -5
Then plot each one in its own layer:
ggplot() + geom_point(data = df2, aes(x, y)) + geom_smooth(data = df, aes(x, y), method = "lm", fullrange = TRUE)
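The same idea can also be written with a single data frame by giving each layer its own y mapping. This is only a sketch: the column name `y_shown` is a hypothetical helper introduced here for illustration, and -5 is just the placeholder value from the question.

```r
library(ggplot2)

set.seed(1234)
df <- data.frame(x = rnorm(100),
                 y = c(rnorm(40), rep(NA, 60)))

# Hypothetical helper column: y with NAs replaced by the fixed placeholder,
# used only for drawing the points.
df$y_shown <- ifelse(is.na(df$y), -5, df$y)

p <- ggplot(df, aes(x = x)) +
  geom_point(aes(y = y_shown)) +               # all 100 points, former NAs at -5
  geom_smooth(aes(y = y), method = "lm",
              formula = y ~ x, fullrange = TRUE)  # fit sees the original y only
p
```

Expect a warning from the smooth layer about removed rows: those are the 60 observations with missing y, which is exactly what keeps them out of the fit.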
But maybe a cleaner way to do it would be to use geom_rug(), something like this:
dfna <- df[is.na(df$y),]
ggplot(df, aes(x,y)) + geom_point() + geom_smooth(method="lm", fullrange=TRUE) + geom_rug(data=dfna, aes(x))
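This marks the observations with a missing y as ticks along the x-axis, while the regression line is fitted to the 40 complete points only.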
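For completeness, here is a self-contained sketch of the rug variant, starting from the original data with the NAs still in place. Setting `inherit.aes = FALSE` on the rug layer is an assumption added here, not part of the code above: it makes the layer see only `x`, so the result does not depend on how a given ggplot2 version treats the inherited, all-NA `y`.

```r
library(ggplot2)

set.seed(1234)
df <- data.frame(x = rnorm(100),
                 y = c(rnorm(40), rep(NA, 60)))

# Rows whose y is missing; only their x values are drawn, as rug ticks.
dfna <- df[is.na(df$y), ]

p <- ggplot(df, aes(x, y)) +
  geom_point() +                                  # rows with NA y are dropped (with a warning)
  geom_smooth(method = "lm", formula = y ~ x,
              fullrange = TRUE) +                 # fitted on the complete rows only
  geom_rug(data = dfna, aes(x = x),
           inherit.aes = FALSE)                   # ticks along the x-axis
p
```

Compared with the placeholder approach, nothing is plotted at an invented y value, so the y-axis range stays that of the real data.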
Upvotes: 5