Reputation: 438
I want to convert the following data into a time series so I can use autoplot().
How do I do this so that the "Year" column ends up on the x-axis? (I know the date format has to be 01-01-2006; I'm OK with that.)
Team PTS W GF GA S SA Year
NSH 88 38 214 233 2382 2365 2014
NSH 104 47 226 202 2614 2304 2015
NSH 96 41 224 213 2507 2231 2016
NSH 94 41 238 220 2557 2458 2017
NSH 117 53 261 204 2641 2650 2018
Using as.ts() turns the Year column into very large, unusable numbers. I want to use the resulting time series for forecasting: ARIMA, VARs, etc. Thanks!
Upvotes: 1
Views: 826
Reputation: 26373
Does this give you what you want?
df_ts <- ts(df[ , setdiff(names(df), c("Team", "Year"))],
            start = 2014,
            end = 2018,
            frequency = 1)
class(df_ts)
#[1] "mts" "ts" "matrix"
I excluded the columns Team and Year from the coercion because Year seems unneeded (the time index is set through start and frequency) and Team is of type character. From ?ts:
Time series must have at least one observation, and although they need not be numeric there is very limited support for non-numeric series.
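To see why the Year column can be dropped, here is a minimal sketch that checks the time index of the resulting series against the original years. It uses only base R; the name df_small and the choice to keep just PTS and W are my own, for illustration.

```r
# Illustrative subset of the question's data (PTS and W only).
df_small <- data.frame(
  Team = "NSH",
  PTS  = c(88L, 104L, 96L, 94L, 117L),
  W    = c(38L, 47L, 41L, 41L, 53L),
  Year = 2014:2018
)

# Drop Team (character) and Year, and let start/frequency define the index.
df_small_ts <- ts(df_small[, setdiff(names(df_small), c("Team", "Year"))],
                  start = min(df_small$Year),
                  frequency = 1)

# time() returns the index that plots will put on the x-axis.
time(df_small_ts)

# The index reproduces the Year column, so nothing is lost by dropping it.
years_match <- all(as.numeric(time(df_small_ts)) == df_small$Year)
years_match
```

If years_match is TRUE, the years are carried by the ts object itself and do not need to be stored as a separate series.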
Use ggfortify::autoplot.ts for plotting.
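A plotting call might look like the sketch below. It assumes the ggplot2 and ggfortify packages are installed; the object name nsh_ts and the two-column subset are only for illustration. With ggfortify loaded, autoplot() dispatches to its method for ts objects, and facets = TRUE draws one panel per series.

```r
library(ggplot2)
library(ggfortify)  # provides the autoplot() method for ts objects

# Small two-series example built from the question's PTS and W columns.
nsh_ts <- ts(cbind(PTS = c(88, 104, 96, 94, 117),
                   W   = c(38, 47, 41, 41, 53)),
             start = 2014,
             frequency = 1)

# One facet per series, with the years on the x-axis.
p <- autoplot(nsh_ts, facets = TRUE)
p
```

Because the result is a ggplot object, the usual ggplot2 layers (labels, themes) can be added to p with +.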
Data used above:
df <- structure(list(Team = c("NSH", "NSH", "NSH", "NSH", "NSH"), PTS = c(88L,
104L, 96L, 94L, 117L), W = c(38L, 47L, 41L, 41L, 53L), GF = c(214L,
226L, 224L, 238L, 261L), GA = c(233L, 202L, 213L, 220L, 204L),
S = c(2382L, 2614L, 2507L, 2557L, 2641L), SA = c(2365L, 2304L,
2231L, 2458L, 2650L), Year = 2014:2018), .Names = c("Team",
"PTS", "W", "GF", "GA", "S", "SA", "Year"), class = "data.frame", row.names = c(NA,
-5L))
One way to show missing observations in your plot is to turn implicit missing observations into explicit ones. I will use tidyr's complete():
library(tidyr)
# Add a row (filled with NA) for every year between the first and last one
df_complete <- complete(df_incomplete, Year = min(Year):max(Year))
df_complete_ts <- ts(df_complete[ , setdiff(names(df_complete), c("Team", "Year"))],
                     start = 2011,
                     frequency = 1)
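If you would rather avoid the tidyr dependency, the same idea can be sketched in base R with merge(). This is an alternative I am swapping in, not the approach used above; the names df_gap, all_years, df_full and pts_ts are hypothetical, and only the PTS column is kept for brevity.

```r
# Years and PTS values taken from the incomplete data; 2013 is absent.
df_gap <- data.frame(
  Year = c(2011L, 2012L, 2014L, 2015L, 2016L, 2017L, 2018L),
  PTS  = c(88L, 88L, 88L, 104L, 96L, 94L, 117L)
)

# Every year from the first to the last observation.
all_years <- data.frame(Year = min(df_gap$Year):max(df_gap$Year))

# Left join: years without an observation get NA in PTS.
df_full <- merge(all_years, df_gap, by = "Year", all.x = TRUE)

# Now every row corresponds to exactly one year, so ts() lines up correctly.
pts_ts <- ts(df_full$PTS, start = min(df_full$Year), frequency = 1)
pts_ts
```

The point of either approach is that ts() assumes equally spaced observations, so a gap has to appear as an explicit NA row before the conversion.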
Data with a missing year (2013 is absent):
df_incomplete <- structure(list(Team = c("NSH", "NSH", "NSH", "NSH", "NSH", "NSH",
"NSH"), PTS = c(88L, 88L, 88L, 104L, 96L, 94L, 117L), W = c(38L,
38L, 38L, 47L, 41L, 41L, 53L), GF = c(214L, 214L, 214L, 226L,
224L, 238L, 261L), GA = c(233L, 233L, 233L, 202L, 213L, 220L,
204L), S = c(2382L, 2382L, 2382L, 2614L, 2507L, 2557L, 2641L),
SA = c(2365L, 2365L, 2365L, 2304L, 2231L, 2458L, 2650L),
Year = c(2011L, 2012L, 2014L, 2015L, 2016L, 2017L, 2018L)), .Names = c("Team",
"PTS", "W", "GF", "GA", "S", "SA", "Year"), class = "data.frame", row.names = c(NA,
-7L))
Upvotes: 1
Reputation: 21
I have had success using the ts() function in R. For yearly data, the code would look something like this:
df <- ts(data, frequency = 1, start = 2014)
This should give you the results you want.
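One caveat, as far as I understand ts(): a data frame is coerced with data.matrix(), so a character column such as Team is turned into integer codes and Year becomes a series of its own rather than the time index. A minimal sketch that keeps only the numeric measurement columns first (the names data_small, series_cols and data_ts are hypothetical):

```r
# Illustrative subset of the question's data.
data_small <- data.frame(
  Team = "NSH",
  PTS  = c(88L, 104L, 96L, 94L, 117L),
  GF   = c(214L, 226L, 224L, 238L, 261L),
  Year = 2014:2018
)

# Keep numeric columns only, and leave Year out because it becomes the index.
numeric_cols <- names(data_small)[sapply(data_small, is.numeric)]
series_cols  <- setdiff(numeric_cols, "Year")

data_ts <- ts(data_small[, series_cols],
              frequency = 1,
              start = min(data_small$Year))
data_ts
```

Taking start from the data rather than hard-coding 2014 keeps the call valid if earlier seasons are added later.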
Upvotes: 2