Reputation: 55
Using the CPDS (Comparative Political Data Set - downloadable here), I want to plot the variables gov_left3, gov_cent3 and gov_right3 for New Zealand. For gov_left3 I'm using the colour tomato3, for gov_cent3 I use blue, and for gov_right3 darkgreen.
When I use ggplot to draw a geom_line, the graph is drawn perfectly, but...
- I can draw the graph nicely without the legend
- If I want a legend (that would be pretty nice), R is somehow mixing up the colours of gov_left3 and gov_right3 in the legend.
- If I relabel the legend labels using scale_color_manual, the colours are right in the legend, but wrong in the graph.
I have tried several functions starting with scale, but haven't found a solution. I guess it's really simple, but I just don't see it.
Do I maybe need to reshape my data frame from long/wide to wide/long?
# Load ggplot2 for the plotting calls below
library(ggplot2)
# Import data
CPDS <- readxl::read_excel("~/Daten/CPDS.xlsx")
# Filter data
NZL <- dplyr::filter(CPDS, iso == "NZL")
# Draw geom_line -> this actually gets me the right graph and a correct legend, all I want to do now is relabel my legend
n <- ggplot() +
  geom_line(data=NZL, aes(y=gov_left3, x=year, colour="tomato3"), size=0.7, linetype = "dashed") +
  geom_line(data=NZL, aes(y=gov_cent3, x=year, colour="blue"), size=0.7, linetype = "dashed") +
  geom_line(data=NZL, aes(y=gov_right3, x=year, colour="darkgreen"), size=0.7, linetype = "dashed")
n1 <- n + geom_vline(xintercept = 1994, color="red", size=1, alpha=0.75) +
  theme_minimal() +
  labs(title = "Parliamentary seat share of all parties", subtitle = "New Zealand government, 1960 to 2014",
       x="Year", y="Seat share in %", caption = "Source: CPDS") +
  theme(plot.title = element_text(size=20, face="bold", margin = margin(10, 0, 10, 0)),
        axis.text.x = element_text(angle=45, vjust=0.5), legend.title = element_text(size=12, face="bold")) +
  scale_x_continuous(breaks = c(1960, 1970, 1980, 1990, 2000, 2010)) +
  scale_color_identity(guide="legend")
Plotting n1 gets me the graph I want, but I want to change the legend. So, instead of colour, I want Types of parties as the title. Then, for blue, I want the label centre, for tomato3 left and for darkgreen I would like right.
I hope I provided all the information needed to help :) Thank you!
Edit: Following PavoDive's suggestion, I used the melt function to transform the data frame from wide to long. With the help of dplyr's functions filter and arrange, I created a data frame containing the three columns year, variable and value, ordered by year.
But if I let ggplot draw a graph, the result is [image not included]
What am I doing wrong?
Upvotes: 1
Views: 4842
Reputation: 55
After reshaping my data from wide to long, I found out (hours later) that the column value was stored as character, not numeric. Therefore, I had to convert the column to numeric using the command
mNZL2$value <- as.numeric(as.character(mNZL2$value))
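A small illustration of why the as.character() step can matter, assuming the column might have ended up as a factor rather than plain character. The toy vector below is made up and is not CPDS data:

```r
# Hypothetical toy values standing in for the melted 'value' column.
value_chr <- c("45.5", "0", "54.5")
value_fct <- factor(value_chr)

# A character vector converts directly.
num_from_chr <- as.numeric(value_chr)

# A factor must go through as.character() first; as.numeric() alone
# returns the internal level codes instead of the printed numbers.
codes_only   <- as.numeric(value_fct)
num_from_fct <- as.numeric(as.character(value_fct))
```

Checking class(mNZL2$value) before plotting would show which of the two cases applies.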
I then plotted the whole thing like this:
ggplot(data=mNZL2, aes(x=year, y=as.numeric(value), colour=variable)) +
  geom_line(linetype="dashed", size=0.6) +
  geom_vline(xintercept = 1996, color="red", size=1, alpha=0.75) +
  theme_minimal() +
  labs(title = "Parliamentary seat share of all parties",
       subtitle = "New Zealand government, 1960 to 2016",
       x="Year",
       y="Seat share in %", caption = "Source: CPDS") +
  theme(plot.title = element_text(size=20, face="bold",
                                  margin = margin(10, 0, 10, 0)),
        axis.text.x = element_text(angle=45, vjust=0.5),
        legend.title = element_text(size=12, face="bold")) +
  scale_x_continuous(breaks = c(1960, 1970, 1980, 1990, 2000, 2010)) +
  scale_color_brewer(palette = "Accent",
                     name = "Party types",
                     labels=c("Right","Centre","Left")) +
  annotate("segment", x = 2002, xend = 1997, y = 65, yend = 60,
           color = "red", size = 1, alpha = 0.75, arrow = arrow()) +
  annotate("text", x = 2008, y = 65, label = "First MMP election",
           color = "red", fontface = "bold")
to get this:
Upvotes: 0
Reputation: 6496
You can solve that by setting new values in the scale_color_* family, but I see an underlying problem with your approach.
It seems you are working with a wide table instead of a long one, which is causing problems like the one you asked about. I'll use the iris data, because you didn't provide any reproducible data.
I'll first create a wide table with years:
dt <- iris[, 1:4] # dropped the species variable
dt$year <- 1:150 # created a year variable
I could plot this by calling each variable (Sepal.Length, Petal.Width, etc.) in a separate geom_line call. But the right thing to do is to convert your data from wide to long. For that I'll use data.table::melt:
require(data.table) # library(data.table) works too
# dt is a plain data.frame, so convert it before handing it to data.table's melt()
df2 <- melt(as.data.table(dt), id.vars = "year") # check df2: it is a long table now
# now the plotting part:
require(ggplot2)
ggplot(df2, aes(x = year, y = value, color = variable)) + geom_line()
Now your plot has proper labels, and the call is simpler. Of course you can rename the columns of df2 either in the call to melt (arguments variable.name and value.name), or directly with names(df2) <-
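As a rough sketch of the renaming option, here is the same iris-based toy table melted with explicit column names. The names "measure" and "reading" are arbitrary choices for illustration only:

```r
library(data.table)

# Same toy wide table as above, built from iris.
dt <- as.data.table(iris[, 1:4])
dt[, year := 1:150]

# Name the long-format columns directly in the melt() call;
# "measure" and "reading" are arbitrary illustrative names.
long <- melt(dt, id.vars = "year",
             variable.name = "measure", value.name = "reading")

names(long)
```

With these names, the aesthetics in the ggplot call would become y = reading and color = measure.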
#### EDIT to add: ####
To change the names and values in the legend, use this at the end of your ggplot chain:
+scale_color_discrete(labels = c("center", "right", "left"), name = "Political Orientation")
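Because positional labels depend on the order of the factor levels, it may be safer to tie colours and labels to the variable names explicitly. A minimal sketch, assuming the long table has a column called variable holding gov_left3, gov_cent3 and gov_right3; the toy data frame below is invented and does not contain real CPDS values:

```r
library(ggplot2)

# Hypothetical long-format data; the real values would come from CPDS.
toy <- data.frame(
  year     = rep(c(1990, 1996, 2002), times = 3),
  variable = rep(c("gov_left3", "gov_cent3", "gov_right3"), each = 3),
  value    = c(0, 30, 60, 0, 10, 5, 100, 60, 35)
)

# Colours are matched to the data by name, not by position.
party_colours <- c(gov_left3 = "tomato3", gov_cent3 = "blue", gov_right3 = "darkgreen")

p <- ggplot(toy, aes(x = year, y = value, colour = variable)) +
  geom_line(linetype = "dashed") +
  scale_colour_manual(
    values = party_colours,
    # breaks and labels are given in the same order, so each label
    # is attached to a specific variable
    breaks = c("gov_left3", "gov_cent3", "gov_right3"),
    labels = c("left", "centre", "right"),
    name   = "Types of parties"
  )
```

Printing p draws the plot. The named values vector and the paired breaks/labels keep the legend and the lines in agreement regardless of column order.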
Upvotes: 2