Soph2010

Reputation: 613

Using tryCatch to replace urls and get final url from website in R

I have a data frame with a column "URLs" that contains 23k redirect URLs. I want to resolve each redirect to its final URL and store the results in a new column. However, some of the original URLs are no longer valid and raise an error, so I want to wrap the request in tryCatch. Since I am still a beginner in R, I am not sure how to write this correctly.

I used dput on the first few rows of my "URLs" column and added one deliberately invalid URL at the end:

c("https://icoholder.com/en/v2/ico/ico-redirect/4321?to=https%3A//sirinlabs.com%3Futm_source%3Dicoholder", 
"https://icoholder.com/en/v2/ico/ico-redirect/1036136?to=https%3A//dash2trade.com%3Futm_source%3Dicoholder", 
"https://icoholder.com/en/v2/ico/ico-redirect/1035284?to=https%3A//impt.io%3Futm_source%3Dicoholder", 
"https://icoholder.com/en/v2/ico/ico-redirect/1030235?to=https%3A//calvaria.io%3Futm_source%3Dicoholder", 
"https://icoholder.com/en/v2/ico/ico-redirect/1011041?to=https%3A//artyfact.art%3Futm_source%3Dicoholder", 
"https://icoholder.com/en/v2/ico/ico-redirect/1031430?to=https%3A//www.projectnexus.app%3Futm_source%3Dicoholder", 
"https://icoholder.com/en/v2/ico/ico-redirect/1005962?to=https%3A//seedon.io%3Futm_source%3Dicoholder", 
"https://icoholder.com/en/v2/ico/ico-redirect/1033498?to=https%3A//vicuna.network%3Futm_source%3Dicoholder", 
"https://icoholder.com/en/v2/ico/ico-redirect/1036409?to=https%3A//cryptoffer.io/%3Futm_source%3Dicoholder", 
"https://icoholder.com/en/v2/ico/ico-redirect/23905?to=http%3A//www.bitcoin.org/%3Futm_source%3Dicoholder", 
"https://icoholder.com/en/v2/ico/ico-redirect/1450?to=https%3A//ethereum.org%3Futm_source%3Dicoholder", 
"https://icoholder.com/en/v2/ico/ico-redirect/17581?to=https%3A//telegram.org%3Futm_source%3Dicoholder", 
"https://icoholder.com/en/v2/ico/ico-redirect/1009688?to=https%3A//egoco.in/%3Futm_source%3Dicoholder", 
"https://icoholder.com/en/v2/ico/ico-redirect/19163?to=https%3A//lapo.io%3Futm_source%3Dicoholder", 
"https://icoholder.com/en/v2/ico/ico-redirect/20971?to=https%3A//ingotcoin.io%3Futm_source%3Dicoholder", 
"https://icoholder.com/en/v2/ico/ico-redirect/26401?to=https%3A//restotoken.org%3Futm_source%3Dicoholder",
"https://icoholder.com/en/v2/ico/ico-redirect/4321?to=https%3A//ccc"
)

and the code I am playing around with currently looks like this:

library(httr)

df$URLs <- tryCatch(sapply(df$URLs, function(x) GET(x)$url), error = function(e) return(NULL))

I have seen questions such as "How to write trycatch in R" that explain how to use tryCatch, but I am not sure how to adapt them to my specific case. I would be grateful for any tips and code adaptations!

Upvotes: 0

Views: 57

Answers (1)

HoelR

Reputation: 6583

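Instead of tryCatch(), I used possibly() from purrr, which does much the same thing: it wraps a function so that, if a call throws an error, the value given in otherwise (here NA) is returned instead of the error. Note that the URL column is called links in the code below, not URLs as in the question.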

library(tidyverse) # attaches dplyr (mutate) and purrr (map_chr, possibly, pluck)
library(httr)      # provides GET()

df %>%
  mutate(final_url = map_chr(
    links,
    possibly( ~ .x %>% 
                GET() %>% 
                pluck("url"), 
              otherwise = NA_character_)
  ))

# A tibble: 17 x 2
   links                                                           final~1
   <chr>                                                           <chr>  
 1 https://icoholder.com/en/v2/ico/ico-redirect/4321?to=https%3A/~ https:~
 2 https://icoholder.com/en/v2/ico/ico-redirect/1036136?to=https%~ https:~
 3 https://icoholder.com/en/v2/ico/ico-redirect/1035284?to=https%~ https:~
 4 https://icoholder.com/en/v2/ico/ico-redirect/1030235?to=https%~ https:~
 5 https://icoholder.com/en/v2/ico/ico-redirect/1011041?to=https%~ https:~
 6 https://icoholder.com/en/v2/ico/ico-redirect/1031430?to=https%~ https:~
 7 https://icoholder.com/en/v2/ico/ico-redirect/1005962?to=https%~ https:~
 8 https://icoholder.com/en/v2/ico/ico-redirect/1033498?to=https%~ https:~
 9 https://icoholder.com/en/v2/ico/ico-redirect/1036409?to=https%~ https:~
10 https://icoholder.com/en/v2/ico/ico-redirect/23905?to=http%3A/~ https:~
11 https://icoholder.com/en/v2/ico/ico-redirect/1450?to=https%3A/~ https:~
12 https://icoholder.com/en/v2/ico/ico-redirect/17581?to=https%3A~ https:~
13 https://icoholder.com/en/v2/ico/ico-redirect/1009688?to=https%~ https:~
14 https://icoholder.com/en/v2/ico/ico-redirect/19163?to=https%3A~ https:~
15 https://icoholder.com/en/v2/ico/ico-redirect/20971?to=https%3A~ https:~
16 https://icoholder.com/en/v2/ico/ico-redirect/26401?to=https%3A~ http:/~
17 https://icoholder.com/en/v2/ico/ico-redirect/4321?to=https%3A/~ NA     
# ... with abbreviated variable name 1: final_url
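
For anyone who prefers to stay with tryCatch() as asked in the question, here is a minimal base R sketch of the same idea. The attempt in the question wraps the whole sapply() call, so a single failing URL aborts the loop and the handler returns NULL for the entire column; moving tryCatch() inside the per-URL function confines the failure to that one element. The helper names get_final_url and resolve_all, the fetch argument, and the fake_fetch stand-in are assumptions made for illustration (the stand-in only exists so the example runs without network access).

```r
# Hypothetical helper: resolve one URL, returning NA instead of raising an error.
# `fetch` defaults to httr::GET; it is an argument only so the function can be
# tried offline with a stand-in.
get_final_url <- function(x, fetch = httr::GET) {
  tryCatch(
    fetch(x)$url,                      # final URL after following redirects
    error = function(e) NA_character_  # this URL failed: record NA and move on
  )
}

# Hypothetical wrapper: apply the helper to every URL.
# vapply() guarantees exactly one character value per input.
resolve_all <- function(urls, fetch = httr::GET) {
  vapply(urls, get_final_url, character(1), fetch = fetch, USE.NAMES = FALSE)
}

# Offline stand-in for GET(): fails for URLs ending in "ccc",
# otherwise returns a list with a `url` element like a real response.
fake_fetch <- function(u) {
  if (grepl("ccc$", u)) stop("could not resolve host")
  list(url = paste0("final:", u))
}

demo <- resolve_all(c("https://a.example", "https://b.example/ccc"),
                    fetch = fake_fetch)
print(demo)

# With the real data (requires the httr package and network access):
# df$final_url <- resolve_all(df$URLs)
```

Keep in mind that GET() only throws an error for connection-level failures such as an unresolvable host. A URL that responds with an HTTP error status such as 404 still returns a response, so its url is stored rather than NA.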

Upvotes: 2
