user1981673

Reputation: 35

Using R to create and merge zoo object time series from csv files

I have a large set of csv files in a single directory. Each file contains two columns, Date and Price, and the filename (the part before .csv) is the unique identifier of the data series. I understand that missing values in merged data series can be handled when the time series are zoo objects. I also understand that, by wrapping merge() in na.locf(), as in na.locf(merge(...)), I can fill in the missing values with the most recent observation.

I want to automate the process of:

  1. loading the Date and Price columns from each *.csv file into R data frames.
  2. giving each distinct time series within the merged zoo "portfolio of time series" object an identity equal to its <filename>.
  3. merging these zoo time series using MergedData <- na.locf(merge( )).

The ultimate goal, of course, is to use the fPortfolio package.

I've used the following statement to create a list of data frames of Date, Price pairs. The problem with this approach is that I lose the <filename> identifier of the time series data from the files.

  result <- lapply(files, function(x) read.csv(x))

I understand that I can write code to generate the R statements required to do all these steps instance by instance. I'm wondering if there is some approach that wouldn't require me to do that. It's hard for me to believe that others haven't wanted to perform this same task.

Upvotes: 1

Views: 1451

Answers (2)

G. Grothendieck

Reputation: 269346

Try this:

z <- read.zoo(files, header = TRUE, sep = ",")
z <- na.locf(z)

I have assumed a header line and data lines like 2000-01-31,23.40. Use whatever read.zoo arguments are needed to accommodate the format you actually have.
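To see this end to end, here is a minimal sketch under stated assumptions: the two sample files, their names (AAA.csv, BBB.csv), and the temporary directory are made up for illustration, and it relies on read.zoo accepting a vector of file names as in the code above. The column labels are set explicitly from the file names afterwards, so the result does not depend on how read.zoo names the merged columns.

```r
library(zoo)

# Hypothetical sample data: two small CSV files in a temporary directory.
demo_dir <- file.path(tempdir(), "zoo_readzoo_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("Date,Price", "2000-01-31,23.40", "2000-02-29,24.10", "2000-03-31,25.00"),
           file.path(demo_dir, "AAA.csv"))
writeLines(c("Date,Price", "2000-01-31,10.00", "2000-03-31,11.50"),
           file.path(demo_dir, "BBB.csv"))

files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Read every file and merge the series on their dates.
# format = "%Y-%m-%d" is an assumption about the date layout.
z <- read.zoo(files, header = TRUE, sep = ",", format = "%Y-%m-%d")

# Label each column with its file name, minus directory and extension.
colnames(z) <- sub("\\.csv$", "", basename(files))

# Carry the last observation forward over gaps left by the merge.
z <- na.locf(z)
```

Here BBB has no observation for 2000-02-29, so the merge leaves an NA there and na.locf fills it with the 2000-01-31 value.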

Upvotes: 2

agstudy

Reputation: 121568

You can get a better-labelled result using sapply (it keeps the file names as the names of the result). Here I will stay with lapply.

  1. Assuming that all your files are in the same directory, you can use list.files. It is very handy for this kind of workflow.
  2. I would use read.zoo to get zoo objects directly (this avoids coercing later).

For example:

zoo.objs <- lapply(list.files(path = MY_FILES_DIRECTORY,
                              pattern = '^zoo_.*\\.csv$',  ## regex: csv files whose
                                                           ##   names start with zoo_
                              full.names = TRUE),          ## return path + filename
                   read.zoo, header = TRUE, sep = ",")     ## assumes a header line, comma-separated

I then use list.files again, with the same pattern, to name the result:

 names(zoo.objs) <- list.files(path = MY_FILES_DIRECTORY,
                               pattern = '^zoo_.*\\.csv$')
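Once the list is named, the remaining step from the question (merging and filling gaps) can be done with do.call. The following is only a sketch: the sample files, the demo directory, and the date format are assumptions for illustration, not part of the original data.

```r
library(zoo)

# Hypothetical sample data: two small CSV files whose names start with zoo_.
demo_dir <- file.path(tempdir(), "zoo_merge_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("Date,Price", "2000-01-31,23.40", "2000-02-29,24.10"),
           file.path(demo_dir, "zoo_AAA.csv"))
writeLines(c("Date,Price", "2000-01-31,10.00", "2000-03-31,11.50"),
           file.path(demo_dir, "zoo_BBB.csv"))

# List the files once so that the objects and their names cannot get out of step.
paths <- list.files(demo_dir, pattern = "^zoo_.*\\.csv$", full.names = TRUE)

# One zoo object per file; header, separator and date format are assumptions.
zoo.objs <- lapply(paths, read.zoo, header = TRUE, sep = ",", format = "%Y-%m-%d")

# Name each series after its file, without directory or extension.
names(zoo.objs) <- sub("\\.csv$", "", basename(paths))

# merge() receives the list elements as named arguments, so the names
# become the column names; na.locf then fills gaps with the last observation.
MergedData <- na.locf(do.call(merge, zoo.objs))
```

Calling list.files once and deriving the names with basename avoids any risk of the two listings differing.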

Upvotes: 1
