Reputation: 1084
I would like to speed up the function below (fndf), which calls another function (fn1) once for each element of a character array.
fndf - new function
list_s - character array, chr [1:400]
rdata_i - empty data frame (for initialization)
fn1 - another custom function
rdata2 - data frame with 3000 obs. of 40 variables
mdata - data.frame
nm - character
fndf = function(list_s, rdata2){
  # Empty data frame used to initialise the accumulator
  rdata_i = data.frame(Date=as.Date(character()),
                       File=character(),
                       User=character(),
                       stringsAsFactors=FALSE)
  for(i in seq_along(list_s))
  {
    # Rows related to list_s[i], prepended to the results so far
    rdata = fn1(list_s[i], rdata2)
    rdata_i = rbind(rdata, rdata_i)
  }
  return(unique(rdata_i))
}
Can we also improve the performance of the function below?
fn1 = function(nm, mdata){
  # Start with the rows whose Sign matches nm
  n0 = mdata[mdata$Sign==nm,]
  cn0 = unique(c(n0$Name))
  repeat{
    # Rows whose Mgr is one of the names found in the previous step
    n1c = mdata[mdata$Mgr %in% cn0,]
    n0 = unique(rbind(n0,n1c))
    if(nrow(n1c)==0){
      return(n0)
    }
    cn0 = unique(c(n1c$Name))
  }
}
Upvotes: 0
Views: 115
Reputation: 545518
It’s indeed hard to say how best to transform your loop into an *apply statement, and even harder to say whether this will speed it up. But fundamentally, the following transformation is what you’re after, and it definitely makes the function simpler and more readable. It quite possibly also yields a substantial performance gain, because it gets rid of the repeated rbind, as noted by baptiste:
fndf = function (list_s, rdata2)
    as.data.frame(do.call(rbind, unique(lapply(list_s, fn1, rdata2))))
(Yes. That’s a single statement.)
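To make the transformation concrete, here is a minimal sketch that runs both styles on made-up data and checks that they collect the same rows. The toy rdata2, its Sign and Value columns, and the one-line fn1 are placeholders invented for illustration; they are not the asker's real data or function.

```r
# Hypothetical toy data; the column names are assumptions for illustration.
rdata2 <- data.frame(Sign = rep(c("a", "b", "c"), each = 2),
                     Value = 1:6,
                     stringsAsFactors = FALSE)
list_s <- c("a", "b", "c")

# Stand-in for the real fn1: return the rows whose Sign matches nm.
fn1 <- function(nm, mdata) mdata[mdata$Sign == nm, ]

# Loop version: grows the result with rbind() on every iteration.
loop_result <- data.frame()
for (s in list_s) {
  loop_result <- rbind(loop_result, fn1(s, rdata2))
}

# List version: collect all pieces first, then bind them in one call.
pieces <- lapply(list_s, fn1, rdata2)
bound_result <- do.call(rbind, pieces)

# Compare the collected values column by column.
same_values <- identical(loop_result$Value, bound_result$Value) &&
  identical(loop_result$Sign, bound_result$Sign)
print(same_values)
```

Whether the single rbind is actually faster on your data is something to measure, for example by wrapping each version in system.time().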
Also note that I’m now applying unique directly to the list rather than to the data.frame. This changes the semantics: on a list, unique drops identical elements (here, whole data.frames), whereas the data.frame method drops duplicate rows. It is probably the right thing for your purposes, and it will be more efficient because we don’t construct a needlessly big data.frame with redundant rows.
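The difference between the two placements of unique is easy to see on a small example. The sketch below uses made-up pieces (piece1 and piece2 are hypothetical stand-ins for what fn1 might return); it only illustrates the semantics and says nothing about which variant suits your data.

```r
# Two hypothetical result pieces that share one row (x = 2)
# but are not identical as whole data frames.
piece1 <- data.frame(x = c(1, 2))
piece2 <- data.frame(x = c(2, 3))
pieces <- list(piece1, piece2, piece1)  # piece1 appears twice

# unique() on the list drops the repeated piece1 as a whole,
# but the row shared by piece1 and piece2 survives.
by_list <- do.call(rbind, unique(pieces))
print(nrow(by_list))  # 4

# unique() on the bound data.frame drops duplicate rows.
by_rows <- unique(do.call(rbind, pieces))
print(nrow(by_rows))  # 3
```

If duplicate rows across different pieces matter, one option is to keep unique on the list for speed and apply unique once more to the final data.frame.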
Upvotes: 4
Reputation: 8691
It's hard to say without your data/functions, but here is a solution with plyr and some placeholder data:
list_s <- LETTERS
rdata2 <- data.frame(a = rep(LETTERS, 2), b = runif(52), c = runif(52) * 10)
# Placeholder fn1: rows of b whose column a equals the given key
fn1 <- function(a, b = rdata2) b[b$a == a, ]
fn1("A")
library(plyr) # for ldply function, which takes a list and returns a dataframe
result <- ldply(seq_along(list_s), function(x) fn1(list_s[x], rdata2))
head(result)
  a           b         c
1 A 0.281940237 2.7774933
2 A 0.023611392 0.6067029
3 B 0.456547803 9.4219258
4 B 0.645783746 5.3094864
5 C 0.475949523 4.8580622
6 C 0.006063407 2.5851738
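As for the second function in the question, fn1, one possible direction is to avoid calling rbind and unique on a growing data.frame inside the loop, and to mark the selected rows in a logical vector instead. This is only a sketch under assumptions: it assumes mdata has the columns Sign, Name and Mgr used in the question, the name fn1_idx and the toy mdata are invented here, and the result is not identical to the original in every case (rows come back in their original order rather than in discovery order, fully duplicated rows in mdata are kept, and NA values in Sign are treated as non-matches).

```r
# Sketch: select rows by marking them in a logical vector.
# Assumes mdata has columns Sign, Name and Mgr, as in the question.
fn1_idx <- function(nm, mdata) {
  keep <- mdata$Sign == nm               # rows matched directly
  keep[is.na(keep)] <- FALSE             # treat NA in Sign as "no match"
  frontier <- unique(mdata$Name[keep])   # names whose reports are still needed
  while (length(frontier) > 0) {
    # Rows managed by someone in the frontier and not selected yet
    new_rows <- mdata$Mgr %in% frontier & !keep
    keep <- keep | new_rows
    frontier <- unique(mdata$Name[new_rows])
  }
  mdata[keep, ]
}

# Hypothetical hierarchy: A manages B, B manages C, D reports elsewhere.
mdata <- data.frame(Name = c("A", "B", "C", "D"),
                    Mgr  = c(NA, "A", "B", "X"),
                    Sign = c("s1", "s2", "s3", "s4"),
                    stringsAsFactors = FALSE)
res <- fn1_idx("s1", mdata)
print(res$Name)
```

Because already selected rows are excluded from each step, this version also stops if the Mgr column contains a cycle. Wrap the result in unique() if duplicate rows need to be removed as in the original.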
Upvotes: 1