
Only implemented on univariate time series

I am trying to implement time series analysis on my data set. Originally my data set has the following attributes.

[1] "Customer"        "Customer.No"     "Shop"            "Invoice"        
[5] "Quantity"        "Sales"           "Cash.Amt"      "Credit.Card.Amt"
[9] "Net.Sales"       "Mens.Wear"       "Womens.Wear"     "Kids.Wear"      
[13] "Foot.Wear"       "Fragrant"        "Class"           "Date"           
[17] "Year"            "Month"

However, I kept only Year and Sales in my data set for the time series analysis. When I try to run the arima function, it gives this error: "only implemented for univariate time series".

data.ts<- as.ts(myData) 
is.ts(data.ts) 
class(data.ts) 
plot(data.ts) 
frequency(data.ts) 
plot(log(data.ts)) 
plot(diff(log(data.ts))) 
acf(data.ts) 
acf(diff(log(data.ts))) 
#p=0 
pacf(diff(log(AirPassengers)))
#q=0
fit <- arima(log(data.ts), c(0, 1, 0), seasonal = list(order = c(0, 1, 0), period = 1))

Can anybody please tell me whether I am using the right attributes for a time series analysis? Also, why does this error occur, and how can I solve it?

These are the first 6 observations of my dataset.

     Sales  Year
[1,]   707  2016
[2,]   306  2016
[3,]   394  2016
[4,]   306  2016
[5,]   491  2016
[6,]   306  2016

The data covers the years 2016, 2017 and 2018, and each year has many different Sales values.

Upvotes: 1

Views: 5679

Answers (1)

Wil

Reputation: 3178

You are receiving the error because as.ts(myData) converts every column of myData, so you are passing a two-column (multivariate) series to the arima() function when it expects a univariate time series. You can eliminate the error by defining your time series from the Sales column only:

data.ts <- as.ts(myData$Sales) 
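
To see the difference between the two conversions, a quick check along these lines may help. This is only a sketch: the small data frame below is a hypothetical stand-in for myData built from the six rows shown in the question, and the names multi and uni are made up for illustration.

```r
# Hypothetical stand-in for myData, using the six rows shown in the question.
myData <- data.frame(
  Sales = c(707, 306, 394, 306, 491, 306),
  Year  = rep(2016, 6)
)

multi <- as.ts(myData)        # converts both columns
uni   <- as.ts(myData$Sales)  # converts the Sales column only

class(multi)  # includes "mts", i.e. a multivariate time series
NCOL(multi)   # 2
NCOL(uni)     # 1
```

An object with more than one column, like multi here, is the kind of input that triggers the "only implemented for univariate time series" error.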

You can then call your arima() function.

fit <- arima(log(data.ts), c(0, 1, 0), seasonal = list(order = c(0, 1, 0), period = 1))

However, I am not sure this is the exact result you want. There are multiple values for the year 2016 in your data, and the Month column in your dataset suggests that you have monthly data. If that is the case, I suspect that setting period = 1 will lead to undesirable results, because a period of 1 often represents annual data, whereas monthly data has 12 observations per year. See ?ts for more information, but if you have monthly data you would want to define your time series in this way:

data.ts <- ts(myData$Sales, frequency = 12, start = c(2016,1))

This indicates that you have monthly data (frequency = 12) beginning in the first month of 2016 (start = c(2016,1)). As another example, if you had monthly data beginning in April 2016 you would set frequency = 12 and start = c(2016,4).
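
Note that ts() with frequency = 12 expects exactly one value per month. If the data holds several invoices per month, as the repeated 2016 rows suggest, one option is to aggregate to monthly totals first. The following is a minimal sketch under stated assumptions: the data frame is made up, Month is assumed to be numeric (1 to 12), no month is missing, and summing Sales is assumed to be the aggregation you want.

```r
# Hypothetical invoice-level data standing in for myData; all values are made up.
set.seed(1)
myData <- data.frame(
  Year  = rep(2016:2018, each = 24),             # 24 invoices per year
  Month = rep(rep(1:12, each = 2), times = 3),   # 2 invoices per month
  Sales = round(runif(72, min = 300, max = 800))
)

# One total per Year/Month combination.
monthly <- aggregate(Sales ~ Year + Month, data = myData, FUN = sum)

# Sort chronologically, because ts() assumes the values are already in time order.
monthly <- monthly[order(monthly$Year, monthly$Month), ]

# Monthly series starting at the first Year/Month present in the data.
data.ts <- ts(monthly$Sales, frequency = 12,
              start = c(monthly$Year[1], monthly$Month[1]))

frequency(data.ts)  # 12
length(data.ts)     # 36 (3 years x 12 months)
```

If some months have no invoices, they would need to be filled in (for example with 0 or NA) before calling ts(), since ts() has no way of knowing that a month was skipped.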

Upvotes: 2
