Reputation: 203
I am trying to plot several series (columns) on one graph using ggplot2, but I can't get it to produce a sensible plot.
Here is what my data looks like (goes from 1976 to 2017):
Year Atlantic Prairie Ter Ontario BC Quebec Canada
1976 1.2638857 0.4546927 NA 0.6815441 0.7264928 1.0050021 0.8424173
1977 1.1722437 0.4819217 NA 0.5951699 0.7264113 0.8883986 0.7701221
1978 1.1990781 0.4870121 NA 0.5737307 0.7684976 0.8672100 0.7604538
1979 1.1287050 0.4333563 NA 0.5194313 0.6579418 0.8407571 0.7086144
1980 1.1133467 0.4198007 NA 0.5313260 0.5992944 0.7677071 0.6745683
Here is my code, which I adapted from similar questions here on SO.
library("reshape2")
library("ggplot2")
coverage <- read.xlsx(. . .)
# Step 1: Rearrange data in tall format
Tall_data <- melt(coverage, id.vars = "Year", variable.name = "series")
# Step 2: Plot
ggplot(Tall_data, aes(Year, value)) + geom_line(aes(colour = series))
In Step 1, R gives me the message: "attributes are not identical across measure variables; they will be dropped"
I have attached the produced plot - it is bizarre.
Note that some values in the "Territories" column are NA (from 1976 to 2003).
I know I can do it this way:
ggplot(coverage, aes(Year)) +
labs(y= "The B/U Ratio") +
geom_line(aes(y=Atlantic_Provinces), colour="green") +
geom_line(aes(y=Prairie_Provinces), colour="red") + ...
But I want to be able to do it with a single command. Also, plotting the columns individually does not give me the legend names. I have looked at other R guides that suggest more or less the same code I have used, but for some reason it's not working for me.
Here are two similar posts:
Plot multiple columns on the same graph in R.
How to plot all the columns of a data frame in R - This is the one I followed.
Upvotes: 0
Views: 168
Reputation: 6769
Is this the right direction?
dat <-
"Year Atlantic Prairie Ter Ontario BC Quebec Canada
1976 1.2638857 0.4546927 NA 0.6815441 0.7264928 1.0050021 0.8424173
1977 1.1722437 0.4819217 NA 0.5951699 0.7264113 0.8883986 0.7701221
1978 1.1990781 0.4870121 NA 0.5737307 0.7684976 0.8672100 0.7604538
1979 1.1287050 0.4333563 NA 0.5194313 0.6579418 0.8407571 0.7086144
1980 1.1133467 0.4198007 NA 0.5313260 0.5992944 0.7677071 0.6745683
"
df <- read.delim(textConnection(dat), sep="")
library(tidyverse)
tall_df <- pivot_longer(df,
  cols = c("Atlantic", "Prairie", "Ter", "Ontario", "BC", "Quebec", "Canada"),
  names_to = "region"
)
ggplot(tall_df, aes(x = Year, y = value, color = region)) +
  geom_line()
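If the plot still looks odd with your real data, the melt() message quoted in the question may be a hint that not every column was read as numeric. That is only a guess, since I can't see the spreadsheet. Below is a minimal sketch of how to check and fix it. The small `coverage` data frame is a made-up stand-in for your import, with `Ter` deliberately stored as text to mimic what could happen if NA was typed into the cells.

```r
# Hypothetical stand-in for the imported spreadsheet (an assumption, not your
# real data): "Ter" is character because its missing values were stored as text.
coverage <- data.frame(
  Year     = c(1976, 1977, 1978),
  Atlantic = c(1.2638857, 1.1722437, 1.1990781),
  Ter      = c("NA", "NA", "NA"),
  stringsAsFactors = FALSE
)

# Inspect how each column was read; anything other than "numeric" or
# "integer" is a candidate cause of the melt() message.
col_classes <- sapply(coverage, class)
print(col_classes)

# Coerce every non-Year column to numeric. Going through as.character()
# first means a factor column converts by its labels, not its level codes.
measure_cols <- setdiff(names(coverage), "Year")
coverage[measure_cols] <- lapply(
  coverage[measure_cols],
  function(x) as.numeric(as.character(x))
)
```

After this, reshaping and plotting as above should treat every region as a numeric series. geom_line() simply leaves gaps where a value is NA.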
Upvotes: 1