Reputation: 23
I have a set of Excel files, each containing one sheet of data, all of similar structure (mostly; see below), that I ultimately want to combine into one large data frame (with each subset indexed by its original source file). I can create a list of data frames and then merge them into one data frame pretty easily with the following code:
files <- grep("\\.xlsx$", dir(), value = TRUE) # vector of file names ending in .xlsx
IDnos <- substr(files,20,24) #vector with key 5-digit ID info of each file
library("XLConnect")
library("data.table")
datalist <- lapply(files, readWorksheetFromFile, sheet = "Data")
names(datalist) <- IDnos
bigdatatable <- rbindlist(datalist, idcol = "IDNo")
One data column, "Value", is usually class numeric, but I found that in several files an "ND" had been entered in one row, making the column class character there, so in the final data frame the column is character.
Although I can fix this with some simple cleaning, I was left wondering whether there is a way to identify, at the "list of data frames" stage, which files (or data frame components of the list I created) have class character for column "Value". For example, sapply(datalist, class) and other variations do not give me what I need. I am hoping to avoid a for-loop.
Is there any way to use lapply or sapply to drill down into dataframes within a list?
Upvotes: 1
Views: 543
Reputation: 33782
Here's how I would use lapply to find the class of column a in a list of 2 data frames, named x and y.
datalist <- list(x = data.frame(a = letters),
y = data.frame(a = 1:26))
lapply(datalist, function(x) class(x$a))
$x
[1] "factor"
$y
[1] "integer"
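To get straight to the question of which files are affected, the same idea can be turned into a logical test. This is only a sketch: the two data frames and their ID names ("10001", "10002") are made-up stand-ins for the real datalist, and it assumes every data frame has a column called Value.

```r
# Hypothetical stand-in for the real list: one clean file, one containing "ND"
datalist <- list(
  "10001" = data.frame(Value = c(1.2, 3.4), stringsAsFactors = FALSE),
  "10002" = data.frame(Value = c("5.6", "ND"), stringsAsFactors = FALSE)
)

# TRUE where the Value column is numeric (integer or double), FALSE otherwise
is_num <- vapply(datalist, function(df) is.numeric(df$Value), logical(1))

# Names (file IDs) of the data frames whose Value column is not numeric
bad_ids <- names(datalist)[!is_num]
bad_ids
```

Using is.numeric rather than comparing class strings means integer columns also count as numeric, and vapply guarantees one logical value per data frame.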
Upvotes: 1