Reputation: 3236
I have some data I would like to visualize and I am able to make the chart on the left, but I would like to be able to provide more information to the viewer by implementing features like those in the image to the right: shaded based on predefined ranges, and percentage of area in each range.
I recognize that this question is similar to these two answers; however, I don't understand densities well enough to get the data frame into the correct format.
Here is the code that replicates my example.
If you can, please use dplyr in your response.
Thank you in advance.
library(dplyr)
library(ggplot2)
options(scipen = 999)
# Get percentages
diamonds %>%
  mutate(Cut = cut,
         Prices = cut(price,
                      breaks = c(0, 2499, 4999, 19000),
                      include.lowest = TRUE, dig.lab = 10)) %>%
  group_by(Cut, Prices) %>%
  summarise(Count = n()) %>%
  group_by(Cut) %>%
  mutate(Pct = round(Count / sum(Count), 2)) %>%
  ungroup()
# Plot
ggplot(diamonds, aes(x = price)) +
  geom_density(fill = "grey50") +
  facet_grid(cut ~ .) +
  geom_vline(xintercept = c(2500, 5000)) +
  theme(axis.text.y = element_blank(),
        axis.ticks.y = element_blank())
Upvotes: 3
Views: 1909
Reputation: 4220
The problem is that you don't have density data in the diamonds data.frame. The other problem you face is that you need to keep the facet information. I'm not sure how to get dplyr to group by cut and get density(). One can generate summary data as you did, but in order to make a density plot you're going to need x,y information for each point.
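For what it's worth, here is a minimal sketch of one possible way to compute the per-cut densities directly with dplyr. It assumes dplyr 0.8.1 or later (for group_modify()); the names dens and p_dens are made up for illustration. Note that density() with its defaults extends the curve slightly beyond the observed price range, so the result may not match geom_density() exactly at the tails.

```r
library(dplyr)
library(ggplot2)

# One density estimate per cut, returned as x/y points (512 per group by default)
dens <- diamonds %>%
  group_by(cut) %>%
  group_modify(~ {
    d <- density(.x$price)        # base R kernel density estimate
    tibble(x = d$x, y = d$y)      # keep only the curve coordinates
  }) %>%
  ungroup()

# The facet variable is preserved, so the curves can be drawn per cut
p_dens <- ggplot(dens, aes(x, y)) +
  geom_line() +
  facet_grid(cut ~ .)
```

Because cut stays in the data frame as a regular column, any later filtering or shading can be done with ordinary dplyr verbs.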
One workaround I found is making the density plot p like this:
p <- ggplot(diamonds, aes(x = price)) +
  geom_density() +
  facet_wrap(~cut, nrow = 5)
And then using the ggplot_build function to get the data that is being plotted:
pg <- ggplot_build(p)
This gets you an object whose data element is a list with one data frame per layer; the first element holds the computed density values:
pg_data <- data.frame(pg$data[[1]], stringsAsFactors = FALSE)
Looking at the output, the columns of interest are y (which is the same as density), x (the price) and PANEL (the facet). I didn't convert PANEL back to a factor with the labels Good, Very Good... but I'm guessing you can.
head(pg_data)
y x density scaled count n PANEL group
1 0.00005370272 326.0000 0.00005370272 0.2756038 0.08646139 1610 1 -1
2 0.00005829975 362.1977 0.00005829975 0.2991959 0.09386259 1610 1 -1
3 0.00006307436 398.3953 0.00006307436 0.3236993 0.10154972 1610 1 -1
4 0.00006798165 434.5930 0.00006798165 0.3488836 0.10945045 1610 1 -1
5 0.00007298816 470.7906 0.00007298816 0.3745772 0.11751094 1610 1 -1
6 0.00007807049 506.9883 0.00007807049 0.4006598 0.12569348 1610 1 -1
ymin ymax fill weight colour alpha size linetype
1 0 0.00005370272 NA 1 black NA 0.5 1
2 0 0.00005829975 NA 1 black NA 0.5 1
3 0 0.00006307436 NA 1 black NA 0.5 1
4 0 0.00006798165 NA 1 black NA 0.5 1
5 0 0.00007298816 NA 1 black NA 0.5 1
6 0 0.00007807049 NA 1 black NA 0.5 1
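A possible way to map PANEL back to the cut labels is sketched below. It relies on the assumption that facet_wrap() numbers the panels in the order of the factor levels of cut, which appears to hold here (panel 1 has n = 1610); the new column name cut is my own choice.

```r
library(ggplot2)

# Rebuild the plot data so this snippet runs on its own
p <- ggplot(diamonds, aes(x = price)) +
  geom_density() +
  facet_wrap(~cut, nrow = 5)
pg_data <- data.frame(ggplot_build(p)$data[[1]], stringsAsFactors = FALSE)

# PANEL is a factor with levels "1".."5"; use it as an index into the cut levels
# (assumption: panel order follows levels(diamonds$cut))
panel_idx <- as.integer(as.character(pg_data$PANEL))
pg_data$cut <- factor(levels(diamonds$cut)[panel_idx],
                      levels = levels(diamonds$cut))
```

With that column in place, facet_wrap(~cut, nrow = 5) can be used instead of facet_wrap(~PANEL, nrow = 5) so the strips show the cut names.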
Now we can redraw the plot, this time using the extracted density data:
ggplot(data = pg_data, aes(x, y)) +
  geom_line() +
  facet_wrap(~PANEL, nrow = 5) +
  geom_area(data = subset(pg_data, x > 2499 & x < 5000),
            aes(x, y), fill = "red", alpha = 0.5)
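To shade every predefined range and estimate the share of area in each, something along these lines might work. This is only a sketch: the break points (2500 and 5000) are taken from the question's vertical lines, the names shaded, area_pct and p_shaded are hypothetical, and the percentages are a simple Riemann-sum approximation of the area under the plotted curve rather than exact counts.

```r
library(dplyr)
library(ggplot2)

# Rebuild the plot data so this snippet runs on its own
p <- ggplot(diamonds, aes(x = price)) +
  geom_density() +
  facet_wrap(~cut, nrow = 5)
pg_data <- ggplot_build(p)$data[[1]]

# Label each density point with the price range it falls in
shaded <- pg_data %>%
  mutate(range = cut(x, breaks = c(0, 2500, 5000, Inf),
                     include.lowest = TRUE, dig.lab = 10))

# Approximate the area per range: the x grid is evenly spaced within a panel
area_pct <- shaded %>%
  group_by(PANEL) %>%
  mutate(dx = x[2] - x[1]) %>%
  group_by(PANEL, range) %>%
  summarise(area = sum(y * dx), .groups = "drop_last") %>%
  mutate(pct = area / sum(area)) %>%   # normalise within each panel
  ungroup()

# geom_ribbon is used instead of geom_area to avoid stacking the ranges
p_shaded <- ggplot(shaded, aes(x, y)) +
  geom_ribbon(aes(ymin = 0, ymax = y, fill = range), alpha = 0.5) +
  geom_line() +
  facet_wrap(~PANEL, nrow = 5)
```

The values in area_pct could then be added as labels with geom_text(). Small gaps may appear between adjacent shaded regions, because each ribbon ends at the last grid point inside its range.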
Upvotes: 2