S31

Reputation: 934

Error: Port 443 Timed Out - Scraping Data

I'm trying to scrape some data, but I keep getting a timed-out error. My internet connection is working fine, and I have updated to the newest version of R as well, so I'm at a loss as to how to approach this. It happens with any URL I try.

library(RCurl)
library(XML)

url = "https://inciweb.nwcg.gov/"
content <- getURLContent(url)
     Error in function (type, msg, asError = TRUE)  : 
       Failed to connect to inciweb.nwcg.gov port 443: Timed out

Upvotes: 0

Views: 3310

Answers (1)

hrbrmstr

Reputation: 78792

You may need to set an explicit timeout on slower connections:

library(httr)
library(rvest)

pg <- GET("https://inciweb.nwcg.gov/", timeout(60))

incidents <- html_table(content(pg))[[1]]

str(incidents)
## 'data.frame': 10 obs. of  7 variables:
##  $ Incident: chr  "Highline Fire" "Cottonwood Fire" "Rattlesnake Point Fire" "Coolwater Complex" ...
##  $ Type    : chr  "Wildfire" "Wildfire" "Wildfire" "Wildfire" ...
##  $ Unit    : chr  "Payette National Forest" "Elko District Office" "Nez Perce - Clearwater National Forests" "Nez Perce - Clearwater National Forests" ...
##  $ State   : chr  "Idaho, USA" "Nevada, USA" "Idaho, USA" "Idaho, USA" ...
##  $ Status  : chr  "Active" "Active" "Active" "Active" ...
##  $ Acres   : chr  "83,630" "1,500" "4,843" "2,969" ...
##  $ Updated : chr  "1 min. ago" "1 min. ago" "3 min. ago" "5 min. ago" ...

Temporary Workaround

If the httr call still times out, you can fetch the page with base R's readLines() and hand the raw bytes to rvest for parsing:

l <- charToRaw(paste0(readLines("https://inciweb.nwcg.gov/"), collapse="\n"))

pg <- read_html(l)

html_table(pg)[[1]]
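Since the original code uses RCurl, you can also raise the timeout there instead of switching to httr. This is a sketch using the standard libcurl options `timeout` and `connecttimeout` (both in seconds); whether a longer timeout actually helps depends on why the connection is stalling:

```r
library(RCurl)
library(XML)

url <- "https://inciweb.nwcg.gov/"

# Allow up to 60s to establish the connection and 120s for the whole transfer
content <- getURLContent(
  url,
  .opts = curlOptions(connecttimeout = 60, timeout = 120)
)

doc <- htmlParse(content)
```

If even a generous timeout fails on every URL, the problem is more likely a proxy or firewall blocking outbound port 443 from R than a slow server.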

Upvotes: 2
