Reputation: 344
I have the following function to get some URLs from a website using RSelenium and PhantomJS.
# Requires the RSelenium and XML packages; `rdr` is an open remoteDriver.
get_url <- function(url){
  rdr$navigate(url)
  # Grab every <div> carrying a data-id attribute
  li <- rdr$findElements(using = 'xpath', "//div[@data-id]")
  str <- sapply(li, function(x) x$getElementAttribute('outerHTML'))
  if(length(str) > 1){
    tree <- htmlParse(str)
    url <- getNodeSet(tree, '//div//a[@class="link url"]')
    # Return the href of every matching link
    sapply(url, xmlGetAttr, 'href')
  }
}
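Here rdr is a PhantomJS remoteDriver that I open beforehand, roughly like this (the exact server setup is just a placeholder for my actual configuration):

library(RSelenium)
library(XML)
rdr <- remoteDriver(browserName = "phantomjs")  # assumes a Selenium/PhantomJS server is already running
rdr$open()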
The URLs to visit are stored in a 30 x 60 matrix, offset_url. I tried fetching them all with the following nested loop:
url_list <- NULL  # grows by one block of rows per page
for(i in 1:ncol(offset_url)){
  for(j in 1:nrow(offset_url)){
    url_list <- rbind(url_list, get_url(offset_url[j, i]))
  }
}
However, it takes a long time to execute.
Is there a way I can use the apply family of functions to reduce the run time?
Upvotes: 0
Views: 705
Reputation: 3587
Is this helpful?
do.call(rbind,
        list(mapply(function(x, y) get_url(offset_url[x, y]),
                    x = row(offset_url), y = col(offset_url))))
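That said, the apply family won't make the scraping itself faster: almost all of the time is spent in rdr$navigate waiting for each page to load, and mapply still visits every cell once. What it does avoid is the hand-written loop. If the repeated rbind on a growing object is also a bottleneck, one alternative sketch (assuming get_url and offset_url as defined in the question) is to collect the results in a list and bind once at the end:

# as.vector walks the matrix column by column, the same order as the nested loop
res <- lapply(as.vector(offset_url), get_url)
url_list <- do.call(rbind, res)  # a single rbind instead of one per page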
Upvotes: 1