Reputation: 155
I am trying to implement time series analysis on my data set. Originally my data set has the following attributes.
[1] "Customer" "Customer.No" "Shop" "Invoice"
[5] "Quantity" "Sales" "Cash.Amt" "Credit.Card.Amt"
[9] "Net.Sales" "Mens.Wear" "Womens.Wear" "Kids.Wear"
[13] "Foot.Wear" "Fragrant" "Class" "Date"
[17] "Year" "Month"
However, I kept only Year and Sales in my data set for the time series analysis. When I run the arima function, it gives this error: "only implemented for univariate time series"
data.ts<- as.ts(myData)
is.ts(data.ts)
class(data.ts)
plot(data.ts)
frequency(data.ts)
plot(log(data.ts))
plot(diff(log(data.ts)))
acf(data.ts)
acf(diff(log(data.ts)))
#p=0
pacf(diff(log(data.ts)))
#q=0
fit <- arima(log(data.ts), c(0, 1, 0), seasonal = list(order = c(0, 1, 0), period = 1))
Can anybody please tell me if I am using the right attributes for a time series analysis? Also, why does this error occur, and how can I solve it?
These are the first 6 observations of my dataset.
Sales Year
[1,] 707 2016
[2,] 306 2016
[3,] 394 2016
[4,] 306 2016
[5,] 491 2016
[6,] 306 2016
The years are 2016, 2017 and 2018, and there are different Sales values for each of them.
Upvotes: 1
Views: 5679
Reputation: 3178
You are receiving the error because you are passing both columns (Sales and Year) to the arima() function, which expects a univariate time series. You can eliminate the error by building your time series from the Sales column alone:
data.ts <- as.ts(myData$Sales)
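(If myData is a matrix, as the [1,] row labels in your printout suggest, use myData[, "Sales"] instead, because $ does not work on matrices.) You can then call your arima() function: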
fit <- arima(log(data.ts), c(0, 1, 0), seasonal = list(order = c(0, 1, 0), period = 1))
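To make the cause of the error concrete, here is a minimal sketch. The values in myData are a hypothetical stand-in shaped like the rows printed in the question, not your real data, and sales.ts is just an illustrative name.

```r
# Hypothetical stand-in for myData: a two-column matrix like the one in the question
myData <- cbind(Sales = c(707, 306, 394, 306, 491, 306, 520, 415),
                Year  = rep(2016, 8))

# as.ts() keeps both columns, so the result is a multivariate series
multi.ts <- as.ts(myData)
is.matrix(multi.ts)              # TRUE

# arima() stops on anything with more than one column
res <- try(arima(log(multi.ts), order = c(0, 1, 0)), silent = TRUE)
inherits(res, "try-error")       # TRUE

# Selecting the column by name works for a matrix and for a data frame
sales.ts <- as.ts(myData[, "Sales"])
is.matrix(sales.ts)              # FALSE
fit <- arima(log(sales.ts), order = c(0, 1, 0))
```

Checking is.matrix() or NCOL() on the series before calling arima() is a quick way to confirm that it is univariate.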
However, I am unsure if this is the exact result you want. There are multiple values for the year 2016 in your data, and based on the column names in your dataset you have monthly data. If this is the case, I suspect that setting period = 1 will lead to undesirable results, because a period of 1 usually indicates annual data, whereas monthly data has a period of 12. You can see ?ts for more information, but if you have monthly data you would want to define your time series in this way:
data.ts <- ts(myData$Sales, frequency = 12, start = c(2016,1))
This indicates that you have monthly data (frequency = 12) beginning in the first month of 2016 (start = c(2016,1)). As another example, if you had monthly data beginning in April 2016 you would set frequency = 12 and start = c(2016,4).
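Note that ts() with frequency = 12 assumes exactly one observation per month, while your rows look like individual invoices. A rough sketch of one way to get there is to total the sales per month first. The data below is randomly generated for illustration, the invoices and monthly names are hypothetical, and Month is assumed to be numeric (1 to 12), which the question does not confirm.

```r
# Hypothetical invoice-level data: several rows per month, 2016 to 2018.
# Month is assumed to be numeric (1-12); adjust if yours is stored differently.
set.seed(1)
invoices <- data.frame(
  Year  = rep(2016:2018, each = 120),
  Month = rep(rep(1:12, each = 10), times = 3),
  Sales = round(runif(360, min = 100, max = 800))
)

# One total per calendar month, so each observation is one time step
monthly <- aggregate(Sales ~ Year + Month, data = invoices, FUN = sum)
monthly <- monthly[order(monthly$Year, monthly$Month), ]  # chronological order

sales.ts <- ts(monthly$Sales, frequency = 12, start = c(2016, 1))

frequency(sales.ts)    # 12
start(sales.ts)        # 2016 1
end(sales.ts)          # 2018 12
head(cycle(sales.ts))  # position of each observation within the year

# With monthly data the seasonal period is 12
fit <- arima(log(sales.ts), order = c(0, 1, 0),
             seasonal = list(order = c(0, 1, 0), period = 12))
```

If you leave period out of the seasonal list, arima() falls back to frequency(x), so defining the series correctly with ts() is usually enough.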
Upvotes: 2