Reputation: 1671
I use the following dataset (downloadable here) and the code (down below) trying to plot several graphs in one ggplot. I know that there are plenty of explanations out there, but still I do not seem to get the job done because I am confused about where to put the commands for ggplot to understand what I want.
In addition, I know that there are two ways raw data can be laid out: wide or long format. When I keep the data in wide format, I have to write a lot of code to get the job done (see code and graph below), but when I convert it to long format, ggplot complains about missing values (see code and error message below).
This is my minimal code example:
library(ggplot2) # for professional graphs
library(reshape2) # to convert data to long format
WDI_GDP_annual <- WDI[which(WDI$Series.Name=='GDP growth (annual %)'),] # extract data I need from dataset
WDI_GDP_annual_short <- WDI_GDP_annual[c(-1,-2,-4)] # wide format
test_data_long <- melt(WDI_GDP_annual_short, id = "Time") # long format
# (only successful) graph with wide format data
ggplot(WDI_GDP_annual_short, aes(x = Time)) +
  geom_line(aes(y = Brazil..BRA., colour = "Brazil..BRA.", group = 1)) +
  geom_line(aes(y = China..CHN., colour = "China..CHN.", group = 1)) +
  theme(legend.title = element_blank())
# several graphs possibilities to plot data in long format and to have to write less (but all complain)
ggplot(data = test_data_long, aes(x = time, y = value, colour = variable)) +
  geom_line() +
  theme(legend.title = element_blank())
ggplot(data = test_data_long, aes(x = time, y = value, color = factor(variable))) +
  geom_line() +
  theme(legend.title = element_blank())
ggplot(test_data_long, aes(x = time, y = value, colour = variable, group = variable)) +
  geom_line()
This is the only successful plot I have got so far, but I do not want to write this much code (since I want to have 6 more graphs in this ggplot):
I know that the long format would be a more elegant way to plot several lines in one plot, but whatever command I use (see above), I always get the following complaint:
Error: Aesthetics must either be length one, or the same length as the dataProblems:time
Does somebody know the answer to my question?
Upvotes: 4
Views: 17819
Reputation: 67778
To start with: your data have strings ".." in your supposedly numerical columns, which will convert the entire columns to class character (or factor, depending on your stringsAsFactors settings).
If you wish to treat ".." as NA, add na.strings = ".." to your read.xxx call. This will ensure that the columns are treated as numeric. str should be your friend after you have read any data set.
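To see the effect of na.strings in isolation, here is a minimal sketch on a tiny inline CSV. The column names and values are made up for illustration and are not taken from the WDI file; only read.csv, class and str from base R are used.

```r
# Toy CSV text standing in for the real file (hypothetical values);
# ".." marks a missing observation, as in the WDI export.
csv_text <- "Time,Brazil,China
2000,4.1,8.4
2001,..,8.3
2002,3.1,9.1"

# Without na.strings, the single ".." entry turns the whole Brazil column
# into character (or factor, if stringsAsFactors = TRUE).
raw <- read.csv(text = csv_text, stringsAsFactors = FALSE)
class(raw$Brazil)

# With na.strings = "..", that entry is read as NA and the column stays numeric.
clean <- read.csv(text = csv_text, na.strings = "..")
class(clean$Brazil)

# str() shows the column classes at a glance.
str(clean)
```

Running str on both data frames side by side makes the difference in column classes easy to spot before any plotting is attempted.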
library(reshape2)
library(ggplot2)
df <- read.csv(file = "https://dl.dropboxusercontent.com/u/109495328/WDI_Data.csv",
               na.strings = "..")
str(df)
# compare with
# df <- read.csv(file = "https://dl.dropboxusercontent.com/u/109495328/WDI_Data.csv")
# str(df)
# melt relevant part of the data
df2 <- melt(subset(df,
                   subset = Series.Name == "GDP growth (annual %)",
                   select = -c(Time.Code, Series.Code)),
            id.vars = c("Series.Name", "Time"))
ggplot(df2, aes(x = Time, y = value, colour = variable, group = variable)) +
  geom_line()
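For a feel of what melt produces, here is a small sketch on a hypothetical wide data frame (made-up numbers, not the WDI data). It also checks column-name case: the plot call above maps x to Time with a capital T, whereas the long-format attempts in the question use lowercase time. Since the error message names time, that mismatch may have contributed to the error; this is an inference from the message, not something verified against the original file.

```r
library(reshape2)

# Hypothetical wide-format data: one column per country.
wide <- data.frame(Time   = 2000:2002,
                   Brazil = c(4.1, NA, 3.1),
                   China  = c(8.4, 8.3, 9.1))

# melt() stacks the country columns into 'variable' and 'value',
# repeating the id column 'Time' once per country.
long <- melt(wide, id.vars = "Time")
names(long)   # "Time" "variable" "value"

# Column names are case-sensitive: 'time' is not a column of the melted data ...
"time" %in% names(long)

# ... but 'time' does exist as a function in the stats package, so a lookup
# of 'time' outside the data finds a function rather than a vector of values.
is.function(time)
```

With the data in this shape, a single geom_line layer with colour = variable draws one line per country, so adding more countries needs no extra plotting code.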
Upvotes: 4