Reputation: 2057
So I come from a background of Matlab and Python (and several others less related). I'm picking up R for a Coursera course.
I followed this SO answer in order to read all my homework files into a list in a single line of code. My code looks like this:
# Get a list of files
files = list.files(path = dataDir, pattern = '*.csv')
# Import the file data
setwd(dataDir)
data = lapply(files, read.csv)
This all works just fine. However, I am getting an object back that I don't know how to access. I mentioned Matlab and Python before because I've attempted to access the data in all the ways I would in those languages.
Here's what summary outputs:
summary(data)
Length Class Mode
[1,] 4 data.frame list
[2,] 4 data.frame list
[3,] 4 data.frame list
There are actually 352 of them, not just 3, but no one needs a listing of all 352. Here's what summary of an individual index outputs:
summary(data[200])
Length Class Mode
[1,] 4 data.frame list
So if I enter data[200] I get a listing of the first 2500 rows of data. But data[200, 100] returns an error, as do data[200][,100] and data[200][100,]. data[200][100] returns [[1]] NULL.
While I haven't fully considered what I will need to do for this homework, I'm sure it will involve calculating means, medians, maxima, etc. of all non-NA values in various data columns. This wasn't tough to do for the quizzes using something like mean(data[!is.na(data$Col1), 'Col6']).
So I imagine I could use a more hackish version of what I need, where I simply load the one file I need at the time I need it, extract only the portion of the data frame I need right then, and loop over all the data files I need to process. However, I'd rather know how to access the data in the object R creates from the lapply line. I suspect this will make more complex analyses later on much easier to code.
Thanks
Upvotes: 0
Views: 1930
Reputation: 7796
When you subset a list, single square brackets [ always return a list. So data[200] returns a list of length 1 containing one data frame, because data is a list. Double square brackets [[ give you the actual object contained in the list (in this case, a data frame). Once you have a data frame, you can select its 100th row with [100,] (or its first 100 rows with [1:100,]), which is why the following works:
data[[200]][100,]
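To make the difference concrete, here is a minimal sketch on a made-up two-element list standing in for the result of lapply(files, read.csv). The column names Col1 and Col6 are borrowed from the question; the values, the variable names, and the use of na.rm = TRUE are assumptions for illustration only.

```r
# Hypothetical stand-in for the list returned by lapply(files, read.csv)
data <- list(
  data.frame(Col1 = c(1, NA, 3), Col6 = c(10, 20, 30)),
  data.frame(Col1 = c(4, 5, NA), Col6 = c(40, NA, 60))
)

# Single brackets on a list give back a (shorter) list
one_elem_list <- data[2]
class(one_elem_list)   # "list"

# Double brackets give back the element itself, here a data frame
df <- data[[2]]
class(df)              # "data.frame"

# Ordinary data frame indexing works once you have the element
row2 <- data[[2]][2, ]         # the second row
first_two <- data[[2]][1:2, ]  # the first two rows

# Apply the same calculation to every data frame in the list:
# mean of Col6 over the rows where Col1 is not NA
# (na.rm = TRUE additionally skips NA values in Col6; an assumption here)
col6_means <- sapply(data, function(d) {
  mean(d[!is.na(d$Col1), "Col6"], na.rm = TRUE)
})
```

Looping with sapply over the list this way should avoid reloading files one at a time, though whether it fits the homework depends on what has to be computed.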
Upvotes: 3