Reputation: 335
I have data in a CSV file like this:
Year,A,B,C,D,E,F,G
2007,3.35,,,,,,
2008,3.54,3.59,,,,,
2009,3.22,3.46,4.43,,,,
2010,3.82,3.63,4.64,,,,
2011,2.91,3.74,4.5,4.13,4.38,,
2012,3.85,3.57,4.13,4,4,4,
2013,4.33,2.93,4.63,4.71,4.25,,
2014,4.73,4,4.81,4.66,4.33,,4
2015,,,4.89,4.68,,,
I'm trying to plot it like this:
library(ggplot2)   # ggplot(), geoms, scales, themes
library(reshape2)  # melt()
scores_raw = read.csv("scores.csv", header = TRUE, fill = TRUE)
scores_melt <- melt(scores_raw, id = "Year")
scores_symb <- c(15, 17, 16, 16, 16, 16, 16)
plot_scores <- ggplot(scores_melt, aes(x=Year, y=value, colour=variable, shape=variable))
plot_scores +
  geom_line() +
  geom_point(size = 10, alpha = 0.6) +
  scale_shape_manual(values = scores_symb,
                     name = "Cohort\nSize",
                     labels = c("200", "100", "25")) +
  ylab("Score (5 = max)") +
  scale_y_continuous(limits = c(0, 5)) +
  theme_bw() +
  theme(
    text = element_text(size = 30)
    , axis.title.y = element_text(vjust = 1.5)
    , axis.title.x = element_text(vjust = 0.1)
    , plot.background = element_rect(fill = "transparent", colour = NA)
    , legend.justification = c(0, 0), legend.position = c(0, 0) #legend.position="none"
    , legend.background = element_rect(fill = "transparent", size = .5, linetype = "dotted")
  )
As you can tell, I've got 7 series but only want to tell them apart by 3 cohort sizes (i.e. shape).
I would like the legend to show only the three shapes that distinguish the three types of data I've got. At the moment, I can produce either a single legend with shapes and colours combined, or (as in the code above) two legends: one with the shapes (4 of which are NAs) and the other with the colours.
Help please!
Upvotes: 0
Views: 2101
Reputation: 35392
You are trying to map cohort size to shape. Instead of hacking the scales, actually map cohort size to shape (and not variable). You can do this by creating a new variable called cohort_size.
scores_raw <- read.table(text = "Year,A,B,C,D,E,F,G
2007,3.35,,,,,,
2008,3.54,3.59,,,,,
2009,3.22,3.46,4.43,,,,
2010,3.82,3.63,4.64,,,,
2011,2.91,3.74,4.5,4.13,4.38,,
2012,3.85,3.57,4.13,4,4,4,
2013,4.33,2.93,4.63,4.71,4.25,,
2014,4.73,4,4.81,4.66,4.33,,4
2015,,,4.89,4.68,,,", sep = ",", header = TRUE)
scores_symb <- c(15, 17, 16, 16, 16, 16, 16)
scores_melt <- reshape2::melt(scores_raw, id = "Year")
# Add the new variable
scores_melt$cohort_size <- scores_melt$variable
# Correctly map the levels
levels(scores_melt$cohort_size) <- scores_symb
# Reorder the levels (you might want to add labels here, in the same order as
# the levels, e.g. labels = c(200, 25, 100) if 15 = 200, 16 = 25 and 17 = 100)
scores_melt$cohort_size <- factor(scores_melt$cohort_size, levels = 15:17)
# Simplified ggplot call that looks ok on my screen
library(ggplot2)
ggplot(scores_melt, aes(x = Year, y = value, colour = variable, shape = cohort_size)) +
  geom_line() +
  geom_point(size = 5, alpha = 0.6) +
  ylab("Score (5 = max)") +
  # Hide the colour legend; guide = FALSE is deprecated in recent ggplot2 versions
  scale_colour_discrete(guide = "none") +
  theme_bw() +
  theme(legend.position = c(0.1, 0.8))
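If you would rather have the legend show the cohort sizes themselves instead of the symbol codes, one option is to build cohort_size from a lookup table. The sketch below is only an illustration: the name cohort_lookup is hypothetical, and the mapping (A = 200, B = 100, everything else = 25) is an assumption read off the shapes and labels in the question, so adjust it to your real cohorts. It uses base R's stack() instead of melt() so that it runs without reshape2, and the plot is built only if ggplot2 is installed.

```r
# Data from the question, read inline so the example is self-contained
scores_raw <- read.csv(text = "Year,A,B,C,D,E,F,G
2007,3.35,,,,,,
2008,3.54,3.59,,,,,
2009,3.22,3.46,4.43,,,,
2010,3.82,3.63,4.64,,,,
2011,2.91,3.74,4.5,4.13,4.38,,
2012,3.85,3.57,4.13,4,4,4,
2013,4.33,2.93,4.63,4.71,4.25,,
2014,4.73,4,4.81,4.66,4.33,,4
2015,,,4.89,4.68,,,", header = TRUE)

# Long format with base R: stack() returns columns `values` and `ind`
scores_long <- data.frame(
  Year = rep(scores_raw$Year, times = ncol(scores_raw) - 1),
  stack(scores_raw[-1])
)
names(scores_long) <- c("Year", "value", "variable")

# ASSUMPTION: series -> cohort size, inferred from the question
cohort_lookup <- c(A = 200, B = 100, C = 25, D = 25, E = 25, F = 25, G = 25)

# Map each series to its cohort size; the level order sets the legend order
scores_long$cohort_size <- factor(
  unname(cohort_lookup[as.character(scores_long$variable)]),
  levels = c(200, 100, 25)
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(scores_long,
              aes(x = Year, y = value, colour = variable, shape = cohort_size)) +
    geom_line() +
    geom_point(size = 5, alpha = 0.6) +
    # A named vector ties each cohort size to its symbol explicitly
    scale_shape_manual(name = "Cohort\nSize",
                       values = c("200" = 15, "100" = 17, "25" = 16)) +
    scale_colour_discrete(guide = "none") +
    ylab("Score (5 = max)") +
    theme_bw()
  # print(p) draws the plot when run as a script
}
```

Because the shape values are named, the symbol for each cohort does not depend on the order of the factor levels; only the legend order does.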
Upvotes: 1