Alexander

Reputation: 4655

lapply and dplyr combination to process nested data frames

I have a set of data frames stored as files in a directory that I want to process for analysis. I read them inside an lapply call first, then I want to process their columns and order their rows by group. So most of the time I need to combine dplyr and lapply to process my data faster. I have searched the web and checked some books, but most of the examples are simple ones and do not cover combining these two.

Here is the sample code which I'm using:

files <- mixedsort(dir(pattern = "*.txt",full.names = FALSE)) # to read data

data <- lapply(files, function(x) {
  tmp <- read.table(file = x, fill = TRUE, sep = "\t", dec = ".", header = FALSE, stringsAsFactors = FALSE)
  df <- tmp[!grepl("AC", tmp$V1), ]  # drop rows whose V1 contains "AC"
  new.df <- select(df, V1:V26)
  new.df <- apply(new.df, function(x){ x[11:26] <- x[11:26]/10000; x })  # this call raises the error
  new.df
})

I am getting the following error:

Error in match.fun(FUN) : argument "FUN" is missing, with no default

Here is a reproducible example that resembles my data. Let's say I want to process the 2nd and 3rd columns of dat and group by the let column. When I try to put the fun command below inside the data code above, I get an error. Any guidance will be appreciated.

dat <- lapply(1:3, function(x)data.frame(let=sample(letters,4),a=sort(runif(20,0,10000),decreasing=TRUE), b=sort(runif(20,0,10000),decreasing=TRUE), c=rnorm(20),d=rnorm(20)))

fun <- lapply(dat, function(x){x[2:3] <-x[2:3] /10000; x})

Upvotes: 1

Views: 4559

Answers (1)

MarkusN

Reputation: 3223

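As mentioned in the comments to your question, the apply call is causing the error: apply expects its arguments as apply(X, MARGIN, FUN), so with no MARGIN given your function is matched to MARGIN and FUN is left missing. However, I don't think apply is what you want anyway, because it converts your data frame to a matrix instead of returning a data frame.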
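To illustrate where the error comes from, here is a small sketch on a made-up data frame (`toy` is hypothetical, not your data). It assumes only base R and shows the failing call next to one that supplies `MARGIN`:

```r
# Toy data frame; `toy` is a made-up stand-in for the real data
toy <- data.frame(a = c(10000, 20000), b = c(30000, 40000))

# The function is matched to MARGIN, so FUN is missing and apply() errors.
# tryCatch() captures the message instead of stopping the script.
res <- tryCatch(
  apply(toy, function(x) x / 10000),
  error = function(e) conditionMessage(e)
)
print(res)

# With MARGIN supplied the call runs, but the result is a matrix,
# not a data frame
m <- apply(toy, 2, function(x) x / 10000)
print(class(m))
```

So even a corrected apply call would hand back a matrix that has to be converted again before any dplyr verbs can be used on it.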

Using just dplyr syntax, your problem can be solved like this:

tmp %>%
  filter(!grepl("AC", V1)) %>%          # drop rows whose V1 contains "AC"
  select(V1:V26) %>%
  mutate_each(funs(./10000), V11:V26)   # divide by 10000, as in the question
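Since the question is about combining lapply with dplyr, here is a minimal sketch of how such a pipeline could sit inside `lapply`, using the reproducible `dat` list from the question rather than the files. It assumes dplyr 1.0 or later, where `across()` is available (`mutate_each()` and `funs()` are deprecated in current releases). The `set.seed()` call, the name `processed`, and ordering by `desc(a)` within each group are my own choices for illustration, not something from the question:

```r
library(dplyr)

set.seed(1)  # only so the example is repeatable

# Same shape as the example list in the question
dat <- lapply(1:3, function(i) data.frame(
  let = sample(letters, 4),
  a = sort(runif(20, 0, 10000), decreasing = TRUE),
  b = sort(runif(20, 0, 10000), decreasing = TRUE),
  c = rnorm(20),
  d = rnorm(20)
))

# For each data frame: scale columns a and b, then order rows by group
processed <- lapply(dat, function(df) {
  df %>%
    mutate(across(a:b, ~ .x / 10000)) %>%
    arrange(let, desc(a))
})

str(processed[[1]])
```

Each element of `processed` is still a data frame, so further dplyr steps such as `group_by(let)` followed by `summarise()` can be added to the same pipeline inside the anonymous function.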

Upvotes: 3
