Reputation: 839
for all the files in one directory, I want to read each file into a data frame then process the file, for example, calculate cor across columns. For example:
files<-list.files(path=".") <br>
names <- substr(files,18,20)
for(i in c(1:length(names))){
name <- names[i]
assign (name, read.table(files[i]))
sapply(3:ncol(name), function(y) cor(name[, 2], name[, y], ))
}
but 'name' is a string in the last statement of the code, how can I process the dataframe 'name'?
Upvotes: 1
Views: 3804
Reputation: 272
The way to do this is to put all the files you wish to read in in one folder, and then work with lists:
your.dir <- "" # adjust
files <- list.files(your.dir)
your.dfs <- lapply(file.path(your.dir, files), read.table)
your.dfs
is now a list holding all your data frames. You can perform functions on all data frames simultaneously using lapply, or you can access individual data frames with the usual subsetting syntax, for example your.dfs[[1]]
to access the first data frame.
Upvotes: 0
Reputation: 6535
This is exactly what R's lists are for. Also calling sapply
to get all of the correlations is unnecessary since cor
returns the correlation matrix so you can just subset
R> files <- list.files(pattern = "tsv")
R> dat <- lapply(files, read.table)
R> dat
[[1]]
a b c
1 2.802164 4.835557 6
2 1.680186 4.974198 3
3 3.002777 4.670041 6
4 2.182691 5.137982 11
5 4.206979 5.170269 5
6 1.307195 4.753041 9
7 2.919497 4.657171 7
8 2.938614 5.305558 9
9 2.575200 4.893604 2
10 1.548161 4.871108 4
[[2]]
a b c
1 -1.8483890 2 6
2 -2.9035164 0 7
3 -0.6490283 1 6
4 -2.8842633 3 2
5 -1.8803775 0 12
6 -3.0267870 1 9
7 0.5287124 0 7
8 -3.7220733 0 2
9 -2.0663912 2 9
10 -1.6232248 1 6
You can then lapply
over this list
again to process or do it as a one liner.
R> dat <- lapply(files, function(x) cor(read.table(x))[1,-1] )
R> dat
[[1]]
b c
0.27236143 -0.04973541
[[2]]
b c
-0.1440812 0.2771511
Upvotes: 3