Reputation: 45
I'm very new to R and just wrote this to obtain the mean for a number of timeseries in one file:
compiled<-read.table("/Users/Desktop/A/1.txt", header=TRUE)
z<-ncol(compiled)
comp_df<-data.frame(compiled[,2:z])
indmean<- rowMeans(comp_df)
The data in each file looks something like this:
Time A1 A2 A3 A4 A5
1 0.1 0.2 0.1 0.2 0.3
2 0.2 0.3 0.4 0.2 0.3
...
It works fine but I am hoping to apply this to many files of the same nature, with varying numbers of timeseries in each file. If anyone could advise on how I can improve the above to do so, it would be great. Thank you in advance!
Upvotes: 0
Views: 827
Reputation: 69201
You can take the steps you've outlined above, roll them up into a function, and then lapply that function over a vector containing the names of the files you want to analyze. Depending on what you need to do, it may or may not make sense to split reading the data from the subsequent analysis so that the data stays in your working environment. For the sake of simplicity, I'm going to assume you don't need the data afterwards.
The general steps will be:
1) Create a vector of your files to be processed. Something like:
filesToProcess <- dir(pattern = "yourPatternHere")
2) Turn your code above into a function
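FUN <- function(dat){
  # dat is the path to one file; the first column (Time) is skipped
  compiled <- read.table(dat, header=TRUE)
  z <- ncol(compiled)
  comp_df <- data.frame(compiled[,2:z])
  # one mean per time point, taken across all series in the file
  indmean <- rowMeans(comp_df)
  return(indmean)
}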
3) lapply FUN over your vector of files and assign the result to a new variable:
out <- lapply(filesToProcess, FUN)
4) Give out some names so you know what goes to what:
names(out) <- filesToProcess
You now have a named list that contains the rowMeans for all files you listed in filesToProcess.
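If you do want to keep the data around, here is a rough sketch of the split version mentioned at the top: read everything into a list first, then compute the means from that list. The temporary directory, the demo1.txt/demo2.txt file names, their contents, and the allData variable are all made up for illustration so the example runs on its own; swap in your own folder and pattern. It also assumes the first column is always Time.

```r
# Hypothetical sample data: two small files with different numbers of series
tmp <- tempfile("series_demo")
dir.create(tmp)
writeLines(c("Time A1 A2 A3",
             "1 0.1 0.2 0.3",
             "2 0.2 0.4 0.6"),
           file.path(tmp, "demo1.txt"))
writeLines(c("Time A1 A2",
             "1 1 3",
             "2 2 4"),
           file.path(tmp, "demo2.txt"))

# full.names = TRUE returns full paths, so this works from any working directory
filesToProcess <- list.files(tmp, pattern = "\\.txt$", full.names = TRUE)

# Read every file once and keep the data frames in a named list
allData <- lapply(filesToProcess, read.table, header = TRUE)
names(allData) <- basename(filesToProcess)

# Row means over every column except the first (Time);
# drop = FALSE keeps a data frame even if a file holds a single series
out <- lapply(allData, function(d) rowMeans(d[, -1, drop = FALSE]))
```

Because lapply keeps the names of the list it is given, out is already named by file, and allData is still there if you need the raw values later. With the sample files above, out[["demo2.txt"]] should hold the means 2 and 3.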
Upvotes: 3