Reputation: 187
I am trying to plot the monthly rainfall data from 1986 to 2016 using ggplot. My dataframe looks like this:
head(df)
Year Month Station Rainfall Remarks
1 1986 Jan stn1 0.0 Observed
2 1986 Feb stn1 10.4 Observed
3 1986 Mar stn1 16.5 Estimated
4 1986 Apr stn1 34.0 Observed
5 1986 May stn1 27.0 Observed
6 1986 Jun stn1 159.4 Observed
str(df)
'data.frame': 1488 obs. of 5 variables:
$ Year : chr "1986" "1986" "1986" "1986" ...
$ Month : Ord.factor w/ 12 levels "Jan"<"Feb"<"Mar"<..: 1 2 3 4 5 6 7 8 9 10 ...
$ Station : Factor w/ 4 levels "stn1","stn2",..: 1 1 1 1 1 1 1 1 1 1 ...
$ Rainfall: num 0 10.4 16.5 34 27 ...
$ Remarks : Factor w/ 2 levels "Estimated","Observed": 2 2 1 2 2 2 2 2 2 2 ...
I tried the following code:
library(ggplot2)
ggplot(df, aes(x=Year, y=Rainfall, col=Station)) + geom_line()
However, this code produces a plot of vertical lines, whereas I want smoothly varying lines.
I want to plot all four stations (stn1 to stn4) such that the color of each line is based on df$Remarks. Also, is it possible to have a unique color for each station?
Your help would be appreciated.
Upvotes: 0
Views: 479
Reputation: 30474
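The lines come out vertical because each Year value holds twelve monthly observations at the same x position. Here is one approach that creates a month-year variable instead: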
library(ggplot2)
library(zoo)
df$Mo_Yr <- as.yearmon(paste0(df$Year, '-', df$Month), "%Y-%b")
ggplot(df, aes(x=Mo_Yr, y=Rainfall, col=Station)) +
geom_line() +
scale_x_yearmon()
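If you would rather not depend on zoo, a similar axis can probably be built with a base R Date column. This is only a minimal sketch: it assumes the locale uses English month abbreviations, and the `demo` data frame and the `Date` column name are made up for illustration rather than taken from the question.

```r
library(ggplot2)

set.seed(1)  # reproducible made-up rainfall values

# Hypothetical two-station sample in the same shape as the question's data
demo <- data.frame(
  Year     = rep(c("1986", "1987"), each = 12, times = 2),
  Month    = rep(month.abb, times = 4),
  Station  = rep(c("stn1", "stn2"), each = 24),
  Rainfall = round(runif(48, 0, 160), 1)
)

# Anchor every observation to the first day of its month
demo$Date <- as.Date(paste(demo$Year, demo$Month, "01"), format = "%Y %b %d")

# One line per station on a continuous date axis
p <- ggplot(demo, aes(x = Date, y = Rainfall, colour = Station)) +
  geom_line() +
  scale_x_date(date_breaks = "6 months", date_labels = "%b %Y")
p
```

Because every row now has a distinct x position within its station, geom_line connects the months in order instead of stacking them on one year.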
If you want to use different color points for Remarks (Observed and Estimated), for a single Station, you could try the following:
ggplot(df, aes(x=Mo_Yr, y=Rainfall)) +
geom_point(aes(col = Remarks)) +
geom_line() +
scale_x_yearmon()
If you want to plot 2 lines for Observed and Estimated, you could add the col argument to geom_line as below. Note that I added some example data to illustrate. Depending on what data you have available, this may (or may not) be what you need.
ggplot(df, aes(x=Mo_Yr, y=Rainfall)) +
geom_line(aes(col=Remarks)) +
scale_x_yearmon()
Data (for last example)
df <- read.table(text =
"Year Month Station Rainfall Remarks
1986 Jan stn1 0.0 Observed
1986 Feb stn1 10.4 Observed
1986 Mar stn1 16.5 Estimated
1986 Apr stn1 34.0 Observed
1986 May stn1 27.0 Observed
1986 Jun stn1 159.4 Observed
1986 Jul stn1 83.1 Estimated
1986 Aug stn1 55.7 Observed
1986 Sep stn1 12.3 Estimated", header = TRUE, stringsAsFactors = TRUE)
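Going beyond the single-station examples, one option for all four stations is to map Station to the line color and Remarks to the point shape, so each station keeps a unique color while estimated values stay distinguishable. This is a minimal sketch, not a definitive answer: the `demo` data frame is randomly generated for illustration (it is not the asker's data), and it uses a base R `Date` column in place of Mo_Yr so that it runs without zoo.

```r
library(ggplot2)

set.seed(42)  # reproducible made-up values

# Hypothetical data: 4 stations x 3 years x 12 months
demo <- expand.grid(
  Month   = month.abb,
  Year    = 1986:1988,
  Station = paste0("stn", 1:4),
  stringsAsFactors = FALSE
)
demo$Rainfall <- round(runif(nrow(demo), 0, 160), 1)
demo$Remarks  <- sample(c("Observed", "Estimated"), nrow(demo),
                        replace = TRUE, prob = c(0.8, 0.2))

# Assumes English month abbreviations in the current locale
demo$Date <- as.Date(paste(demo$Year, demo$Month, "01"), format = "%Y %b %d")

# Color identifies the station; point shape identifies the remark
p <- ggplot(demo, aes(x = Date, y = Rainfall)) +
  geom_line(aes(colour = Station)) +
  geom_point(aes(colour = Station, shape = Remarks), size = 1.5) +
  facet_wrap(~ Station, ncol = 1)
p
```

The facet_wrap call is optional; dropping it overlays the four stations in a single panel, which may be harder to read with 31 years of monthly data.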
Upvotes: 2
Reputation: 90
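You might want to try adding a stat_smooth layer. Note that Year is stored as character in your data, so it has to be converted to numeric before a polynomial can be fitted:
# Year is a character column in the question, so convert it for a continuous x axis
ggplot(df, aes(x = as.numeric(Year), y = Rainfall)) +
geom_line(aes(color = Station)) +
# Fit a 10th-degree polynomial trend across all stations, without a confidence band
stat_smooth(method = lm, formula = y ~ poly(x, 10), se = FALSE)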
Upvotes: 1