Means of timeseries in R

Question

I'm very new to R and just wrote this to obtain the mean for a number of timeseries in one file:

compiled<-read.table("/Users/Desktop/A/1.txt", header=TRUE)

z<-ncol(compiled)

comp_df<-data.frame(compiled[,2:z])

indmean<- rowMeans(comp_df)

The data in each file looks something like this:

Time A1 A2 A3 A4 A5

1 0.1 0.2 0.1 0.2 0.3

2 0.2 0.3 0.4 0.2 0.3

...

It works fine but I am hoping to apply this to many files of the same nature, with varying numbers of timeseries in each file. If anyone could advise on how I can improve the above to do so, it would be great. Thank you in advance!

Chase · Accepted Answer

You can take the steps you've outlined above, roll them up into a function, and then lapply that function over a vector containing the names of the files you want to analyse. Depending on what you need to do, it may or may not make sense to split reading the data from the subsequent analysis, so that you can keep the data in your working environment. For the sake of simplicity, I'm going to assume you don't need the data afterwards.
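If you do want to keep the raw data around, one possible way to split reading from analysis is sketched below. The helper names readSeries and seriesMeans, the temporary directory, and the example values are all made up for illustration; the rest of this answer sticks with the simpler single-function approach.

```r
# Hypothetical helper: read one file and keep the full data frame
readSeries <- function(path) {
  read.table(path, header = TRUE)
}

# Hypothetical helper: mean across all series at each time point
# (every column except the first one, Time)
seriesMeans <- function(df) {
  rowMeans(df[, -1, drop = FALSE])
}

# Two small example files in a temporary directory (made-up values)
tmp <- tempfile("series_")
dir.create(tmp)
writeLines(c("Time A1 A2", "1 0.1 0.3", "2 0.2 0.4"),
           file.path(tmp, "1.txt"))
writeLines(c("Time A1 A2 A3", "1 0.1 0.2 0.3", "2 0.4 0.5 0.6"),
           file.path(tmp, "2.txt"))

files <- list.files(tmp, pattern = "\\.txt$", full.names = TRUE)

# Step A: read everything once; the raw data stays in the workspace
rawData <- lapply(files, readSeries)
names(rawData) <- basename(files)

# Step B: run the analysis on the data that is already loaded
means <- lapply(rawData, seriesMeans)
```

With this layout, rawData can be reused for other summaries without reading the files again.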

The general steps will be:

1) Create a vector of your files to be processed. Something like:

filesToProcess <- dir(pattern = "yourPatternHere")

2) Turn your code above into a function:

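FUN <- function(dat) {
  # Read one file; the first column is Time, the rest are the series
  compiled <- read.table(dat, header = TRUE)
  # Drop the Time column; drop = FALSE keeps a data frame even with one series
  comp_df <- compiled[, -1, drop = FALSE]
  # Mean across all series at each time point
  rowMeans(comp_df)
}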

3) lapply the FUNction over your vector of files and assign the result to a new variable:

out <- lapply(filesToProcess, FUN)

4) Name the elements of out so you know which result belongs to which file:

names(out) <- filesToProcess

You now have a named list that contains the rowMeans for all files you listed in filesToProcess.
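To see the four steps working together, here is a small end-to-end sketch. It assumes two made-up example files written to a temporary directory (exampleDir and the values in them are invented for illustration) and uses a version of the function from step 2. Passing full.names = TRUE to dir is an addition here, so that the files can be read regardless of the current working directory.

```r
# A version of the function from step 2
FUN <- function(dat) {
  compiled <- read.table(dat, header = TRUE)
  rowMeans(compiled[, -1, drop = FALSE])  # all columns except Time
}

# Made-up example files with different numbers of series
exampleDir <- tempfile("timeseries_")
dir.create(exampleDir)
writeLines(c("Time A1 A2 A3 A4 A5",
             "1 0.1 0.2 0.1 0.2 0.3",
             "2 0.2 0.3 0.4 0.2 0.3"),
           file.path(exampleDir, "1.txt"))
writeLines(c("Time B1 B2",
             "1 1.0 3.0",
             "2 2.0 4.0"),
           file.path(exampleDir, "2.txt"))

# Step 1: collect the files; full.names = TRUE returns complete paths
filesToProcess <- dir(exampleDir, pattern = "\\.txt$", full.names = TRUE)

# Steps 3 and 4: apply the function and name the results
out <- lapply(filesToProcess, FUN)
names(out) <- basename(filesToProcess)

# Pull out the row means for a single file by name
out[["2.txt"]]

# Number of time points found in each file
sapply(out, length)
```

Because the results sit in a named list, files with different numbers of series or time points can be handled together without any reshaping.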
