Reputation: 109
I want to add multiple vertical lines in my density plot that start at the x-axis and end at the curve using ggplot2. I'm using the starwars dataset from dplyr. I want to plot the height variable as a normal distribution. The dashed lines inside the curve represent the standard deviations. So far I got this (just the plot without the lines):
sd.values = seq(66, 264, 34.77043)
zeros.vector = rep(0, 6)
ggplot(starwars, aes(x=height, y=dnorm(height, m=mean(height, na.rm=T), s=sd(height, na.rm=T)))) +
  geom_line() + labs(x='height', y='f(height)') +
  scale_x_continuous(breaks=sd.values, labels=sd.values)
[Image: density plot without lines]
Now, I want to add the dashed lines using geom_segment:
ggplot(starwars, aes(x=height, y=dnorm(height, m=mean(height, na.rm=T), s=sd(height, na.rm=T))))+
  geom_line() + labs(x='height', y='f(height)') +
  scale_x_continuous(breaks=sd.values, labels=sd.values) +
  geom_segment((aes(x=sd.values, y=zeros.vector, xend=sd.values,
                    yend=dnorm(sd.values, m=mean(height, na.rm=T), s=sd(height, na.rm=T)))),
               linetype='dashed')
But in the end, I only get the following error message:
Error: Aesthetics must be either length 1 or the same as the data (87): x, y, xend and yend
Any idea what I have to change in order to add the dashed lines?
Upvotes: 0
Views: 628
Reputation: 616
When you specify the data argument in ggplot(), that dataset becomes the default for every layer. All aesthetic expressions must then have either length 1 or the same length as that dataset, unless you supply a separate dataset for a geom. To avoid setting a default dataset, you can specify the data argument in the geoms instead.
library(tidyverse)
data(starwars)
sd.values <- seq(66, 264, 34.77043)
mean_height <- mean(starwars$height, na.rm = TRUE)
sd_height <- sd(starwars$height, na.rm = TRUE)
ggplot() +
  geom_line(data = starwars,
            aes(x = height, y = dnorm(height, m = mean_height, sd = sd_height))) +
  geom_segment(data = NULL,
               aes(x = sd.values, xend = sd.values,
                   y = 0, yend = dnorm(sd.values, m = mean_height, sd = sd_height)),
               linetype = 'dashed')
Note though that the following call will fail even though you specify data=NULL, because ggplot2 will replace the NULL dataset with starwars, the default.
ggplot(data = starwars, aes(x = height, y = dnorm(height, m = mean_height, sd = sd_height))) +
  geom_line() +
  geom_segment(data = NULL,
               aes(x = sd.values, xend = sd.values,
                   y = 0, yend = dnorm(sd.values, m = mean_height, sd = sd_height)))
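If you would rather keep starwars as the default dataset, one option is annotate(), which takes plain vectors instead of mapped aesthetics, so its inputs are not checked against the 87 rows of the default data. This is a swapped-in technique, not the approach used in the rest of this answer. A minimal sketch, with the same variable names as above:

```r
library(dplyr)    # provides the starwars dataset
library(ggplot2)

sd.values <- seq(66, 264, 34.77043)
mean_height <- mean(starwars$height, na.rm = TRUE)
sd_height <- sd(starwars$height, na.rm = TRUE)

# The default dataset and aesthetics are used by geom_line() only.
# annotate() builds its own small data frame from the vectors it is given;
# the length-1 value y = 0 is recycled to match the six x positions.
p <- ggplot(starwars,
            aes(x = height,
                y = dnorm(height, mean = mean_height, sd = sd_height))) +
  geom_line() +
  annotate("segment",
           x = sd.values, xend = sd.values,
           y = 0,
           yend = dnorm(sd.values, mean = mean_height, sd = sd_height),
           linetype = "dashed")
```

Call print(p) to draw the plot.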
Alternatively, you can create a new dataset and specify that.
library(tidyverse)
data(starwars)
mean_height <- mean(starwars$height, na.rm = TRUE)
sd_height <- sd(starwars$height, na.rm = TRUE)
df <- data.frame(
  sd_values = seq(66, 264, 34.77043)
) %>% mutate(yend = dnorm(sd_values, mean_height, sd_height))
ggplot() +
  geom_line(data = starwars,
            aes(x = height, y = dnorm(height, m = mean_height, sd = sd_height))) +
  geom_segment(data = df,
               aes(x = sd_values, xend = sd_values,
                   y = 0, yend = yend),
               linetype = 'dashed')
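The length mismatch behind the original error can be checked directly. This small sketch assumes only the starwars data from dplyr and the sd.values vector from the question; n_rows and n_segments are illustrative names, not part of the original code.

```r
library(dplyr)  # provides the starwars dataset

sd.values <- seq(66, 264, 34.77043)

# Rows in the default dataset: every aesthetic mapped against starwars
# must have this length (or length 1).
n_rows <- nrow(starwars)

# Number of dashed lines to draw: this is the length of the vectors
# passed to geom_segment() in the question.
n_segments <- length(sd.values)

n_rows      # 87, the number quoted in the error message
n_segments  # 6
```

Because 6 is neither 1 nor 87, the segment layer needs its own dataset (or no default dataset at all).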
Upvotes: 2
Reputation: 8506
You need to supply a separate data.frame (or tibble) to the geom_segment layer; it can have different dimensions from the main dataset. For example:
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(ggplot2)
sd.values = seq(66, 264, 34.77043)
# zeros.vector = rep(0, 6)
ggplot(starwars, aes(x=height, y=dnorm(height, m=mean(height, na.rm=T), s=sd(height, na.rm=T))))+
  geom_line() + labs(x='height', y='f(height)') +
  scale_x_continuous(breaks=sd.values, labels=sd.values) +
  geom_segment(mapping = aes(x=SD, y=Zeros, xend=SD,
                             yend=dnorm(SD, m=mean(starwars$height, na.rm=T), s=sd(starwars$height, na.rm=T))),
               linetype ='dashed', inherit.aes = F, data=data.frame(SD=sd.values, Zeros=rep(0, 6)))
#> Warning: Removed 6 row(s) containing missing values (geom_path).
Created on 2020-12-27 by the reprex package (v0.3.0)
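The warning in the output above names geom_path, so it most likely comes from the geom_line layer dropping rows where height is NA, not from the segments. A minimal sketch, assuming that is the cause, filters those rows out before plotting; the names heights and segments are illustrative and do not appear in the original code.

```r
library(dplyr)    # provides the starwars dataset
library(ggplot2)

# Keep only characters with a recorded height (the assumed cause of the warning).
heights <- starwars %>% filter(!is.na(height))

mean_height <- mean(heights$height)
sd_height <- sd(heights$height)

# One row per dashed line: its x position and the curve height at that position.
segments <- data.frame(SD = seq(66, 264, 34.77043)) %>%
  mutate(yend = dnorm(SD, mean = mean_height, sd = sd_height))

p <- ggplot(heights,
            aes(x = height,
                y = dnorm(height, mean = mean_height, sd = sd_height))) +
  geom_line() +
  labs(x = "height", y = "f(height)") +
  geom_segment(data = segments,
               aes(x = SD, xend = SD, y = 0, yend = yend),
               linetype = "dashed", inherit.aes = FALSE)
```

Call print(p) to draw the plot. Computing the mean and standard deviation once, outside aes(), also avoids repeating the na.rm arguments in every layer.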
Upvotes: 2