wwnpo01

Reputation: 77

Maximizing speed of a loop/apply function

I am struggling with a huge data set at the moment. What I would like to do is not very complicated, but it is just too slow. In the first step, I need to check whether a website is active or not. For this purpose, I used the following code (here with a sample of three API paths):

library(httr)

# TRUE if the site responds with an HTTP error status (>= 400)
Updated <- function(x) { http_error(GET(x)) }

websites <- data.frame(url = c("https://api.crunchbase.com/v3.1/organizations/designpitara",
                               "www.twitter.com",
                               "www.sportschau.de"))
abc <- apply(websites, 1, Updated)

I have already noticed that a for loop is much faster than the apply function. However, the full code (which has around 1 million APIs to check) would still take around 55 hours to execute. Any help is appreciated :)
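
One caveat with the check itself: GET() throws an error if a host cannot be reached at all, which would abort a long run. A hedged variant of Updated (assuming an unreachable site should count as an error, i.e. return TRUE) traps those failures and caps each request with a timeout:

# Sketch: treat connection failures as errors (an assumption about the
# desired semantics) and cap each request at 10 seconds so dead hosts
# fail fast instead of hanging the loop.
Updated <- function(x) {
  tryCatch(http_error(GET(x, timeout(10))), error = function(e) TRUE)
}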

Upvotes: 0

Views: 246

Answers (2)

Feakster

Reputation: 556

Alternatively, something like this would work for loading multiple packages on the PSOCK cluster:

clusterEvalQ(cl, {
  library(data.table)
  library(survival)
})
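
For context, here is a sketch of where that call sits in the wider PSOCK workflow (the packages are just the placeholders from above; swap in whatever your function needs):

library(parallel)

cl <- makeCluster(detectCores(), type = "PSOCK")
clusterEvalQ(cl, {   # load the packages on every worker
  library(data.table)
  library(survival)
})
# ... parLapply()/parSapply() calls that rely on those packages ...
stopCluster(cl)      # always release the workers when done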

Upvotes: 1

Feakster

Reputation: 556

The primary limiting factor will probably be the time taken to query the website. Currently, you're waiting for each query to return a result before executing the next one. The best way to speed up the workflow would be to execute batches of queries in parallel.

If you're using a Unix system you could try the following:

### Packages ###
library(parallel)

### On your example ###
abc <- unlist(mclapply(websites[[1]], Updated, mc.cores = 3))

### On a larger number of sites ###
abc <- unlist(mclapply(websites[[1]], Updated, mc.cores = detectCores()))

### You can even go beyond your machine's core count ###
abc <- unlist(mclapply(websites[[1]], Updated, mc.cores = 40))

However, the precise number of workers at which you saturate your processor or your internet connection depends on your machine and your network.
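
If you want to find that saturation point empirically, one rough approach (assuming a small sample of your URLs is representative of the rest) is to time a few worker counts on a subset:

# Sketch: time a subset at several worker counts (sizes are illustrative)
sample_urls <- websites[[1]][1:100]
for (n in c(4, 8, 16, 32)) {
  elapsed <- system.time(
    unlist(mclapply(sample_urls, Updated, mc.cores = n))
  )["elapsed"]
  cat(n, "workers:", elapsed, "seconds\n")
}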

Alternatively, if you're stuck on Windows:

### For a larger number of sites ###
cl <- makeCluster(detectCores(), type = "PSOCK")
clusterExport(cl, varlist = "websites")
clusterEvalQ(cl = cl, library(httr))
abc <- parSapply(cl = cl, X = websites[[1]], FUN = Updated, USE.NAMES = FALSE)
stopCluster(cl)

In the case of PSOCK clusters, I'm not sure whether there are any benefits to be had from exceeding your machine's core count, although I'm not a Windows person, and I welcome any correction.
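
Since the job is I/O-bound rather than CPU-bound, an event-driven HTTP client may also be worth a look, instead of or alongside worker processes. Here is a rough sketch with the curl package's multi interface (assumptions on my part: a HEAD-style request via the nobody option and a 10-second timeout are acceptable for your APIs); it fires many requests concurrently from a single R process:

library(curl)

urls <- websites[[1]]
is_error <- logical(length(urls))  # TRUE mirrors http_error()'s meaning

for (i in seq_along(urls)) {
  local({
    j <- i  # freeze the index for the callbacks
    h <- new_handle(url = urls[j], nobody = TRUE, timeout = 10)
    multi_add(h,
              done = function(res) is_error[j] <<- res$status_code >= 400,
              fail = function(msg) is_error[j] <<- TRUE)
  })
}
multi_run()  # runs the whole pool concurrently, blocking until finished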

Upvotes: 1
