Reputation: 547
The colors in my density plot are wrong! I can't figure out why.
Here is my data: https://pastebin.com/0jqHgvxx
data %>%
  ggplot(aes(x = amountremain, color = black)) +
  geom_density()
When I check the raw data, the red peak at x = 0 is correct, but the largest x value belongs to the red group, not the blue one.
The max x value for black = TRUE is 162414.6 and for black = FALSE it is 253021.3, so the tail should be red, not blue.
b <- unclass(density(data$amountremain[data$black==FALSE]))
max(b$y)
max(b$x)
[1] 0.0003079798
[1] 253021.3
a <- unclass(density(data$amountremain[data$black==TRUE]))
max(a$y)
max(a$x)
[1] 0.0002832889
[1] 162414.6
Upvotes: 0
Views: 145
Reputation: 2485
If you zoom in on the y-axis, you can see that the last non-zero value for TRUE is just about 160000, while the last non-zero value for FALSE is about 250000, as it should be.
So the plot is correct; the tails are just hard to see at the default scale.
To see this, restrict the y-axis:
data %>%
  ggplot(aes(x = amountremain, color = black)) +
  geom_density() +
  ylim(0, 10^-5)
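As a side note, `ylim()` sets scale limits, and by default ggplot2 turns values outside the limits into missing values, so the peaks are dropped from the drawing. A minimal sketch of the same zoom with `coord_cartesian()`, which only changes the visible window, is below. The `sim` data frame is a hypothetical stand-in for the pastebin data, not the asker's actual values.

```r
library(ggplot2)

# Hypothetical stand-in for the asker's data: two right-skewed groups
# whose maxima differ, loosely mimicking the shape in the question.
set.seed(1)
sim <- data.frame(
  amountremain = c(rexp(500, rate = 1 / 20000), rexp(500, rate = 1 / 30000)),
  black = rep(c(TRUE, FALSE), each = 500)
)

# coord_cartesian() zooms the view without altering the computed density,
# so values above the upper limit are kept and simply fall outside the panel.
p_zoom <- ggplot(sim, aes(x = amountremain, color = black)) +
  geom_density() +
  coord_cartesian(ylim = c(0, 1e-5))

p_zoom
```

With the real data, replacing `sim` with `data` should give the same zoomed view of the tails.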
EDIT
@MrFlick explained why the line doesn't break.
If your goal is to cut off the TRUE curve at its last value, one possible solution is to build two separate density data frames:
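library(dplyr)
library(ggplot2)

# Estimate the density of a numeric vector and return the curve as a tibble
to_dens <- function(x) {
  d <- density(x)
  tibble(x = d$x, y = d$y)
}

# One density curve per group; `data` is the data frame from the question
# (on dplyr >= 1.1.0, reframe() is the preferred verb for multi-row results)
df1 <- data %>%
  filter(black == TRUE) %>%
  summarise(to_dens(amountremain))
df2 <- data %>%
  filter(black == FALSE) %>%
  summarise(to_dens(amountremain))

# Draw each curve as its own line so it ends where its own density grid ends
ggplot() +
  geom_line(data = df1, aes(x = x, y = y), col = "steelblue3") +
  geom_line(data = df2, aes(x = x, y = y), col = "firebrick2")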
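A possibly simpler alternative is the `trim` argument of `stat_density()`, which `geom_density()` passes through. By default each density is computed on the full range of the data, which is why the TRUE curve keeps running near zero out to the overall maximum; with `trim = TRUE` each density is computed over the range of its own group. The sketch below uses a hypothetical simulated data frame `sim` in place of the pastebin data.

```r
library(ggplot2)

# Hypothetical stand-in for the asker's data (not the real values).
set.seed(1)
sim <- data.frame(
  amountremain = c(rexp(500, rate = 1 / 20000), rexp(500, rate = 1 / 30000)),
  black = rep(c(TRUE, FALSE), each = 500)
)

# trim = TRUE restricts each group's density curve to that group's own range.
p_trim <- ggplot(sim, aes(x = amountremain, color = black)) +
  geom_density(trim = TRUE)

# Inspect the computed curves: the largest x per group should match the
# largest observed value in that group.
curves <- ggplot_build(p_trim)$data[[1]]
curve_max <- tapply(curves$x, curves$group, max)
data_max <- tapply(sim$amountremain, sim$black, max)

p_trim
```

Note that trimmed curves stop exactly at each group's minimum and maximum, whereas the `density()` approach above extends a few bandwidths past the data, so the two plots will not look identical at the ends.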
Upvotes: 2