Reputation: 2640
I'm looking to work with the data frames I have within a list. I'm getting 'incorrect number of subscripts' and similar errors, despite my current best efforts. Here's my code:
folder = 'C:/Path to csv files-071813/'
symbs = c('SPX', 'XLF', 'XLY', 'XLV', 'XLI', 'IYZ', 'XLP', 'XLE', 'XLK', 'XLB', 'XLU', 'SHV')
importData = vector('list', length(symbs))
names(importData) = symbs
for (sIdx in 1:length(symbs)){
#Import the data for each symbol into the list.
importData[sIdx] = read.csv(paste(folder, symbs[sIdx], '.csv', sep = ''), header = TRUE)
}
Each csv file is thousands of rows, and 7 columns. I'm assuming what I have above is returning a data frame from each csv file, into my list. I'd like to enter:
importData[[1]][, 1]
to work with the first column of the first data frame in my list. Am I close? I can't find resolution despite all my searching. Many thanks in advance...
Upvotes: 1
Views: 1928
Reputation: 174853
Yes, you are close. You need
importData[[sIdx]] <- read.csv(....)
(i.e. [[) as you want to assign a data frame inside the sIdx-th component. Single brackets [ would require a list to be assigned.
importData[[1]] returns the object inside importData[1]. This is a subtle difference, with the latter returning a list containing the first component, whereas the former returns the object inside that list.
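To see the difference concretely, here is a minimal sketch using a made-up list and data frame (toyList, dfA and their values are invented for illustration and are not the question's data). It also shows what single-bracket assignment of a data frame ends up storing, which may be the source of the subscript errors:

```r
# Hypothetical stand-in for one imported CSV file.
dfA <- data.frame(x = 1:3, y = c(10, 20, 30))

toyList <- vector('list', 2)
names(toyList) <- c('A', 'B')

# Single brackets treat the data frame as a list of columns, so only the
# first column is stored (R also emits a warning, suppressed here).
suppressWarnings(toyList[1] <- dfA)
firstTry <- toyList[[1]]
is.data.frame(firstTry)       # FALSE: just the vector 1:3

# Double brackets store the whole data frame as one component.
toyList[[1]] <- dfA
is.data.frame(toyList[[1]])   # TRUE
class(toyList[1])             # "list": a one-element list wrapping the data frame
toyList[[1]][, 2]             # 10 20 30
```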
As importData[[sIdx]] is a data frame, you can index it as you would any other data frame. It might help to think of importData[[sIdx]] as data frame df and then add on to that what you would normally use to index the first column, i.e. df[, 1] (or alternatively df[[1]]), then substitute back in the real object instead of df:
df[, 1]
importData[[sIdx]][, 1] ## substitute back in the real object for `df`
If you want to extract each first column in turn, then
lapply(importData, `[`, , 1) ## matches df[, 1]
or
lapply(importData, `[[`, 1) ## matches df[[1]]
will return them as a list, with versions using sapply() instead of lapply() simplifying the result to an array where possible.
Note that in the first example
lapply(importData, `[`, , 1)
the empty argument (, , 1) is important as it refers to the empty argument in df[ , 1], i.e. the bit before the comma. Hence the second option, using [[ in the lapply() call, may be less error-prone, which is why I mentioned it earlier.
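As a rough check of the two forms, the sketch below runs both on a made-up list of two small data frames (toyData and its values are assumptions for illustration only) and compares the results, then shows sapply() simplifying to a matrix:

```r
# Hypothetical stand-ins for the imported data frames.
toyData <- list(
  A = data.frame(x = 1:3, y = c(10, 20, 30)),
  B = data.frame(x = 4:6, y = c(40, 50, 60))
)

viaSingle <- lapply(toyData, `[`, , 1)   # df[, 1] for each element
viaDouble <- lapply(toyData, `[[`, 1)    # df[[1]] for each element
identical(viaSingle, viaDouble)          # TRUE

# The first columns all have the same length here, so sapply() returns
# a matrix with one column per list element.
asMatrix <- sapply(toyData, `[[`, 1)
dim(asMatrix)                            # 3 2
```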
Upvotes: 1
Reputation: 242
myfunc <- function(a, b) {
  # a: numeric vector of symbol indices to include
  # b: vector of column indices to include
  if (length(a) == 0) {
    stop('Must select at least one symbol')
  }
  importalldata <- read.csv(paste(folder, symbs[a[1]], '.csv', sep = ''), header = TRUE)[b]
  if (length(a) > 1) {
    for (i in 2:length(a)) {
      importalldata <- rbind(importalldata, read.csv(paste(folder, symbs[a[i]], '.csv', sep = ''), header = TRUE)[b])
    }
  }
  return(importalldata)
}
to load your data for one symbol, do:
importalldata<-myfunc(1,1)
for multiple symbols:
importalldata<-myfunc(c(1,3,4),1)
for multiple columns:
importalldata<-myfunc(c(1,3,4),1:3)
I think that is what you want? Or are you trying to get all column 1's for each file into 1 dataframe? If you include reproducible data, you will get a better answer.
That said, thousands of rows isn't much, and you would likely be better off combining ('stacking') your data into one csv with your symbols as a factor, and then using subset() or the data.table package to select the data you want. Check out
?stack
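One possible reading of the stacking suggestion, sketched in base R with made-up data frames (toyData, the symbol column name and all values are assumptions for illustration). It uses rbind() rather than stack(), since each element here is a whole data frame:

```r
# Hypothetical stand-ins for the per-symbol data frames.
toyData <- list(
  A = data.frame(x = 1:3, y = c(10, 20, 30)),
  B = data.frame(x = 4:6, y = c(40, 50, 60))
)

# Tag each data frame with its list name, then bind the rows together.
tagged <- Map(function(df, sym) cbind(symbol = sym, df), toyData, names(toyData))
stacked <- do.call(rbind, tagged)
stacked$symbol <- factor(stacked$symbol)
rownames(stacked) <- NULL

# Select the rows for one symbol from the combined data frame.
onlyB <- subset(stacked, symbol == 'B')
onlyB$x
```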
Upvotes: 0
Reputation: 568
The apply family of functions are going to be your friend here, specifically lapply, a function which, given a list and a function, applies the function to every element of that list and returns the results as elements of a new list.
folder = 'C:/Path to csv files-071813/'
symbs = c('SPX', 'XLF', 'XLY', 'XLV', 'XLI', 'IYZ', 'XLP', 'XLE', 'XLK', 'XLB', 'XLU', 'SHV')
filenames = paste0(folder,symbs,'.csv')
listOfDataframes=lapply(filenames,read.csv,header=TRUE)
Now if you want the first column from all the dataframes you could do something like
listOfFirstCols=lapply(listOfDataframes,"[",,1)
Or more explicitly
listOfFirstCols=lapply(listOfDataframes,function(x)x[,1])
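For a version that can be run without the original files, the sketch below writes two tiny CSV files to a temporary folder first (the symbols 'AAA' and 'BBB', the folder name and the data are all invented for illustration). It also uses setNames() so each data frame can be looked up by symbol, which is an addition to the approach above:

```r
# Write two small CSV files to a temporary folder (hypothetical data and symbols).
folder <- file.path(tempdir(), 'csvdemo')
dir.create(folder, showWarnings = FALSE)
symbs <- c('AAA', 'BBB')
for (s in symbs) {
  write.csv(data.frame(x = 1:3, y = c(10, 20, 30)),
            file.path(folder, paste0(s, '.csv')), row.names = FALSE)
}

# Read every file with lapply() and name the list elements by symbol.
filenames <- file.path(folder, paste0(symbs, '.csv'))
listOfDataframes <- setNames(lapply(filenames, read.csv, header = TRUE), symbs)

listOfDataframes[['BBB']][, 1]   # first column of the 'BBB' data frame
```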
Upvotes: 4