Dikonstans

Reputation: 47

Check whether each URL in a list exists

I have a data frame with URLs like this:

df <- data.frame('urls' = c('https://www.deakin.edu.au/current-students/unitguides/UnitGuide.php?year=2015&semester=TRI-1&unit=SLE010',
                            'https://www.deakin.edu.au/current-students/unitguides/UnitGuide.php?year=2015&semester=TRI-2&unit=HMM202',
                            'https://www.deakin.edu.au/current-students/unitguides/UnitGuide.php?year=2015&semester=TRI-2&unit=SLE339'))

I want to check whether each URL exists. The goal is a data frame with two columns: the first holds the URLs and the second holds TRUE or FALSE, depending on whether the URL exists.

I use this code to build it:

library(RCurl)  
df_exist <- data.frame()
for (i in 1:nrow(df)) {
    url <- df$urls[i]
    exist <- url.exists(url)
    df_exist <- rbind(df_exist, data.frame( url = url,
                                         exist = exist))
}

But it gives me this error:

R Session Aborted
R encountered a fatal error
The session was terminated

I can't figure out what I am doing wrong in the code, or how to fix it.

Upvotes: 0

Views: 1166

Answers (2)

CCD

Reputation: 610

Looks to me like RCurl doesn't love that your URLs are factors. I didn't have an issue when I converted them to characters.

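library(RCurl)
df_exist <- data.frame()
for (i in 1:nrow(df)) {
    # Convert the factor element to a plain character string before checking
    url <- as.character(df$urls[i])
    exist <- url.exists(url)
    # Append one row per URL with its TRUE/FALSE result
    df_exist <- rbind(df_exist, data.frame(url = url,
                                           exist = exist))
}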
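To see the factor issue in isolation, here is a minimal sketch. It assumes placeholder example.com URLs (not the ones from the question) and sets stringsAsFactors = TRUE explicitly, because R 4.0.0 and later no longer convert strings to factors by default in data.frame(). It only inspects and converts the column; it makes no network calls.

```r
# Placeholder URLs for illustration only (an assumption, not from the question)
df <- data.frame(urls = c("https://example.com/a",
                          "https://example.com/b"),
                 stringsAsFactors = TRUE)

# The column is stored as a factor, not as character strings
was_factor <- is.factor(df$urls)
print(class(df$urls))

# Convert the whole column once, instead of element by element in the loop
df$urls <- as.character(df$urls)
print(class(df$urls))
```

After the conversion, df$urls[i] is already a plain string, so the as.character() call inside the loop becomes unnecessary.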

Also, there is no need to write that for loop. Read up on the apply family of functions. Something like sapply(df$urls, function(x) url.exists(as.character(x))) should give you the same TRUE/FALSE values as a vector, which you can then attach to the data frame as a column.
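The loop-free idea can be sketched as a small helper. The names check_urls and fake_checker are hypothetical and not from the original posts, and vapply is swapped in for sapply so the result is guaranteed to be logical. The checker is passed in as an argument so the example runs offline with a stand-in function; the URLs are placeholders.

```r
# Hypothetical helper: applies `checker` to each URL and returns
# a two-column data frame (url, exist).
check_urls <- function(urls, checker) {
  urls <- as.character(urls)  # factors become plain strings
  exist <- vapply(urls,
                  function(u) isTRUE(checker(u)),  # anything other than TRUE counts as FALSE
                  logical(1),
                  USE.NAMES = FALSE)
  data.frame(url = urls, exist = exist, stringsAsFactors = FALSE)
}

# Offline demonstration with a stand-in checker (an assumption for this
# sketch): it treats URLs ending in "a" as existing.
urls <- factor(c("https://example.com/a", "https://example.com/b"))
fake_checker <- function(u) grepl("a$", u)
result <- check_urls(urls, fake_checker)
print(result)
```

With RCurl installed, calling check_urls(df$urls, RCurl::url.exists) should produce the two-column data frame the question asks for, assuming the network requests themselves succeed.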

Upvotes: 5

TSRTSR

Reputation: 53

I had the same problem and did something like this:

library(RCurl)  # provides url.exists()
library(dplyr)  # provides mutate()
url_exists <- function(x) url.exists(as.character(x))
df_exist <- mutate(df, exist = sapply(urls, url_exists))

Upvotes: 1
