Barry

Reputation: 267

Need second legend in ggplot for geom_hline's

I am a relative newbie to R and I'm writing code to use ggplot2 to create a chart from some pretty basic data. My plotting code currently looks like this:

chart1_data = read.csv(data_file, as.is=T)
chart1_means = read.csv(means_file, as.is=T)

p = ggplot(data=chart1_data, aes(x=entity, y=usage, fill=medicine)) +
geom_bar(stat="identity", position=position_dodge()) +
geom_hline(data=chart1_means, aes(yintercept=value), show.legend=FALSE)

This draws a chart of grouped vertical bars with black horizontal lines across the bars representing mean values and has a legend showing the color coding of the vertical bars.

I want to do a couple of things:

  1. display the horizontal lines (geom_hline) in colors (the default set of colors that R has, mapped to the varying number of lines in the chart1_means table)

  2. show a second legend that shows these line colors and maps to the column 1 value in the chart1_means file which is a textual label.

For clarity the chart1_means CSV file looks like this:

label,value
USA Codeine mean, 14.2
Canada Codeine mean, 12.7
etc.

And the chart1_data CSV file looks like this:

year,medicine,entity,usage,units
2006,Codeine,Mexico,0.8,mg/capita
2006,Codeine,Cuba,NA,mg/capita
etc.

I have Googled this without success. There seem to be lots of ways to do similar things, but nothing I can find is quite applicable.

UPDATE

I took bethanyP's advice in designing something that is closer to correct but still wrong. Code currently looks like this:

chart1_data = read.csv(data_file, as.is=T)
chart1_means = read.csv(means_file, as.is=T)

means_labels = chart1_means$label
colors = rainbow(length(means_labels))

p = ggplot(data=chart1_data, aes(x=entity, y=usage, fill=medicine)) +
geom_bar(stat="identity", position=position_dodge(), show.legend=TRUE) +
geom_hline(data=chart1_means, aes(yintercept=value), color=colors) +
scale_fill_manual("means", values=colors, guide=guide_legend(override.aes = list(colors)))

The result is colored lines overlaying colored bars (good) but still only one legend. The legend has the title of "means" (the line oriented data) but shows the colors and labels of the "medicines" (the bar oriented data).

I thought I might be able to do this instead:

scale_fill_manual("means", values=colors, labels=means_labels)

but this fills in the single legend with title "means", colors associated with the bars, and labels that are a subset of the "means_labels" (since there are fewer bars than means lines).

I'm pretty much at an impasse. Still need two legends from the two different data series. Any other suggestions?

Upvotes: 0

Views: 618

Answers (1)

sconfluentus

Reputation: 4993

Add show.legend = TRUE to geom_bar (show_guide = TRUE in ggplot2 versions before 2.0.0) to explicitly tell it to draw a legend for the bars, which is the primary legend.

+ geom_bar(stat="identity", position=position_dodge(), show.legend = TRUE)

You can color the hline simply by adding the argument color = "red" to geom_hline. While you are at it, add a fill="some text here" argument to label the lines in your second legend, especially if you are adding multiple lines of varying colors.

+ geom_hline(data=chart1_means, aes(yintercept=value), fill="mean", color="red")
+ geom_hline(data=chart1_sd, aes(yintercept=value), fill="Standard Deviation", color="purple")

Then you can use scale_fill_manual with guide=guide_legend to get you the rest of the way home.

+ scale_fill_manual("Means & SD", guide=guide_legend(override.aes = list(color=c("purple", "red"))))
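A second legend is often obtained a different way: map the line color inside aes(), so that ggplot2 builds a color legend for the lines alongside the fill legend for the bars. Below is a minimal sketch of that idea, not a definitive fix. It assumes ggplot2 2.0.0 or later, the two small data frames are made-up stand-ins for the CSV files in the question, and the legend titles are arbitrary.

```r
library(ggplot2)

# Hypothetical stand-ins for chart1_data and chart1_means (values invented for illustration)
chart1_data <- data.frame(
  entity   = rep(c("Mexico", "Cuba", "USA"), times = 2),
  medicine = rep(c("Codeine", "Morphine"), each = 3),
  usage    = c(0.8, 1.1, 14.0, 0.5, 0.9, 7.2)
)
chart1_means <- data.frame(
  label = c("USA Codeine mean", "Canada Codeine mean"),
  value = c(14.2, 12.7)
)

p <- ggplot(chart1_data, aes(x = entity, y = usage, fill = medicine)) +
  geom_bar(stat = "identity", position = position_dodge()) +
  # Mapping colour inside aes() is what produces the second (colour) legend
  geom_hline(data = chart1_means, aes(yintercept = value, colour = label)) +
  scale_fill_discrete(name = "Medicine") +
  scale_colour_discrete(name = "Means")

print(p)
```

Because the bars use the fill aesthetic and the lines use the colour aesthetic, each gets its own scale and therefore its own legend. Swapping scale_colour_discrete for scale_colour_manual(values = ...) would let you pick the line colors yourself.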

You can also create a variable to hold the list of colors, then pick each one by using the variable name and an integer index, which is handy if you are using a palette as a container.
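As a small illustration of the palette idea, here is a sketch using base R's rainbow(). The variable names line_colors, first_color, and second_color are hypothetical; the labels are the two shown in the question's means file.

```r
# Hypothetical palette: one colour per mean line
line_colors <- rainbow(2)

first_color  <- line_colors[1]  # first colour in the palette
second_color <- line_colors[2]  # second colour in the palette

# Naming the vector ties each colour to a label; a named vector like this
# can be passed as the values argument of a manual colour scale
names(line_colors) <- c("USA Codeine mean", "Canada Codeine mean")
```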

If you have problems with colors and lines not being where you expect and the legend having the wrong stuff in it, look at the order of your lines of ggplot functions.

If you have the scale_fill_manual before the geom_hline it may color and build a legend for the wrong aesthetic, like your bars. Cut and paste until you have them in the right spot.

ggplot is amazingly powerful and completely NOT intuitive. I keep a cheat sheet with me at all times to help with this stuff. There is no shame in it. RStudio has a good one.

Upvotes: 1
