Rajan

Reputation: 426

How do I convert a data.frame to a time series object in R?

I am new to R. I have a tab-separated text file with sales data for each day. Each row has the format product-id, day0, day1, day2, day3, and so on. Part of the input file is shown below:

productid   0   1   2   3   4   5   6
1           53  40  37  45  69  105 62
4           0   0   2   4   0   8   0
5           57  133 60  126 90  87  107
6           108 130 143 92  88  101 66
10          0   0   2   0   4   0   36
11          17  22  16  15  45  32  36

I used the code below to read the file:

pdInfo <- read.csv("products.txt",header = TRUE, sep="\t")

This reads the entire file, and the variable pdInfo is a data frame. I would like to convert the data frame pdInfo to a time series object for further processing. When I run a stationarity test, the augmented Dickey–Fuller (ADF) test, I get an error. I tried the code below:

x <- ts(data.matrix(pdInfo),frequency = 1)
adf <- adf.test(x)

  error: Error in adf.test(x) : x is not a vector or univariate time series

Thanks in advance for any suggestions.

Upvotes: 2

Views: 6118

Answers (2)

JohanS

Reputation: 67

library(purrr)
library(dplyr)
library(tidyr)
library(tseries)

# create the data

df <- structure(list(productid = c(1L, 4L, 5L, 6L, 10L, 11L), 
                     X0 = c(53L, 0L, 57L, 108L, 0L, 17L), 
                     X1 = c(40L, 0L, 133L, 130L, 0L, 22L), 
                     X2 = c(37L, 2L, 60L, 143L, 2L, 16L), 
                     X3 = c(45L, 4L, 126L, 92L, 0L, 15L), 
                     X4 = c(69L, 0L, 90L, 88L, 4L, 45L), 
                     X5 = c(105L, 8L, 87L, 101L, 0L, 32L), 
                     X6 = c(62L, 0L, 107L, 66L, 36L, 36L)), 
                .Names = c("productid", "0", "1", "2", "3", "4", "5", "6"), 
                class = "data.frame", row.names = c(NA, -6L))

# reshape to one row per (productid, day), then apply adf.test to each
# productid and return its p.value

adfTest <- df %>%
  gather(key = day, value = sales, -productid) %>%
  mutate(day = as.integer(day)) %>%   # sort days numerically, not as text
  arrange(productid, day) %>%
  group_by(productid) %>%
  nest() %>%
  mutate(adf = map(data, ~ adf.test(as.ts(.x$sales))),
         adf.p.value = map_dbl(adf, "p.value")) %>%
  select(productid, adf.p.value)

Upvotes: -1

lebelinoz

Reputation: 5068

In R, time series are usually in the form "one row per date", whereas your data is in the form "one column per date". You probably need to transpose the data before you convert it to a ts object.
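To illustrate the orientation issue, here is a minimal base R sketch. It assumes a small hand-built copy of the first three products and four days from the question (the X0, X1, ... column names are what read.csv would typically generate), and the names wide and long are only for illustration.

```r
# Toy copy of part of the question's table (hypothetical subset, for illustration).
pdInfo <- data.frame(
  productid = c(1L, 4L, 5L),
  X0 = c(53L, 0L, 57L),
  X1 = c(40L, 0L, 133L),
  X2 = c(37L, 2L, 60L),
  X3 = c(45L, 4L, 126L)
)

# ts() treats each COLUMN as one series and each ROW as one time step,
# so the untransposed data becomes 5 "series" observed at 3 time steps.
wide <- ts(data.matrix(pdInfo), frequency = 1)
dim(wide)

# Dropping the id column and transposing gives one series per product,
# observed at 4 time steps (one per day).
long <- ts(t(data.matrix(pdInfo[, -1])), frequency = 1)
colnames(long) <- as.character(pdInfo$productid)
dim(long)
colnames(long)
```

In the untransposed object the productid column is itself treated as a series, which is probably not what you want.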

First transpose it:

y= t(pdInfo)

Then make the top row (the product IDs) into the column names:

colnames(y) = y[1,]
y= y[-1,] # to drop the first row

This should work:

x = ts(y, frequency = 1)
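Note that x is now a multivariate series (one column per product), and adf.test() from the tseries package still expects a single series, so the test probably has to be run column by column. A rough sketch, assuming the data from the question is rebuilt by hand instead of read from products.txt, and with p_values as an illustrative name:

```r
library(tseries)  # provides adf.test()

# Same layout as the question's data after read.csv(): one row per product.
pdInfo <- data.frame(
  productid = c(1L, 4L, 5L, 6L, 10L, 11L),
  X0 = c(53L, 0L, 57L, 108L, 0L, 17L),
  X1 = c(40L, 0L, 133L, 130L, 0L, 22L),
  X2 = c(37L, 2L, 60L, 143L, 2L, 16L),
  X3 = c(45L, 4L, 126L, 92L, 0L, 15L),
  X4 = c(69L, 0L, 90L, 88L, 4L, 45L),
  X5 = c(105L, 8L, 87L, 101L, 0L, 32L),
  X6 = c(62L, 0L, 107L, 66L, 36L, 36L)
)

# One column per product, one row per day.
y <- t(data.matrix(pdInfo[, -1]))
colnames(y) <- as.character(pdInfo$productid)
x <- ts(y, frequency = 1)

# Run the ADF test on each product's series separately and keep the p-value.
# suppressWarnings() hides the notes adf.test() emits when a p-value falls
# outside its lookup table.
p_values <- sapply(colnames(x), function(id) {
  suppressWarnings(adf.test(x[, id])$p.value)
})
p_values
```

With only seven observations per product the test has very little to work with, so treat these p-values as a demonstration of the mechanics and not as evidence about stationarity.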

Upvotes: 4
