Reputation: 51
I'm using rvest to scrape a website. Other websites work fine, but I think this one uses a different type of certificate. I've seen similar questions here and on GitHub, but none of the answers helped.
Any help is appreciated.
My script is as follows:
library(jsonlite)  # provides fromJSON()
url <- "https://search.codal.ir/api/search/v1/q?PageNumber=1&Symbol=%D9%81%D8%B3%D8%A7&Subject=%20&CompanyState=0&LetterType=6&TracingNo=-1&LetterCode=%20&FromDate=1395/01/01&ToDate=%DB%B1%DB%B3%DB%B9%DB%B8/%DB%B0%DB%B3/%DB%B1%DB%B6&AuditorRef=-1&YearEndToDate=&Publisher=false&Mains=true&Childs=false&Audited=false&NotAudited=true&Length=-1&Consolidatable=true&NotConsolidatable=true&CompanyType=1&Category=1"
data <- fromJSON(url)[[3]]
and the error is:
"Error in open.connection(con, "rb") : SSL certificate problem: unable to get local issuer certificate"
Upvotes: 3
Views: 4692
Reputation: 667
You can try:
library(httr)
set_config(config(ssl_verifypeer = 0L))
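One caveat: as far as I can tell, set_config() only affects requests made through httr, while fromJSON(url) in the question opens its own connection. The sketch below routes the request through httr and then parses the body with jsonlite. fetch_json_insecure() is a hypothetical helper name, not part of either package, and it turns off peer verification, so treat it as an experiment only.

```r
# Sketch only: fetch JSON through httr with peer verification disabled.
# fetch_json_insecure() is a hypothetical helper, not part of httr or jsonlite.
fetch_json_insecure <- function(url) {
  # Apply the config to this one request instead of globally.
  resp <- httr::GET(url, httr::config(ssl_verifypeer = 0L))
  # Turn HTTP error statuses (4xx/5xx) into R errors.
  httr::stop_for_status(resp)
  # Read the body as text, then parse the JSON.
  body <- httr::content(resp, as = "text", encoding = "UTF-8")
  jsonlite::fromJSON(body)
}

# Usage, with `url` as defined in the question:
# data <- fetch_json_insecure(url)[[3]]
```

Passing the config per request keeps verification enabled for every other request in the session.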
Upvotes: 0
Reputation: 58034
This is a wrongly configured server (search.codal.ir). A friendly email to their admins could probably be considered.
The problem here is that this TLS server doesn't send a complete cert chain in the handshake, which it should according to the standards. More specifically, it doesn't send the intermediate certificate. You can see this by entering "search.codal.ir" into the SSL Labs test page; the results are clear.
Intermediate certificates are certs that are sort of "in between" the root certificate (that exists in the CA store) and the server's own certificate.
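This sometimes works better in browsers than with curl because browsers may cache intermediate certificates seen on earlier connections, and some of them fetch a missing intermediate using the AIA (Authority Information Access) field in the server's certificate.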
The curl error message "unable to get local issuer certificate" almost always means this is what happened.
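If you want to check the chain from R instead of the SSL Labs page, something like the following might work. It is a rough sketch that assumes the openssl package is installed and that download_ssl_cert() returns the certificates as the server presented them; describe_chain() is a hypothetical helper name.

```r
# Sketch only: list subject and issuer of each certificate a server presents.
# describe_chain() is a hypothetical helper built on the openssl package.
describe_chain <- function(host, port = 443) {
  # Download the certificates offered during the TLS handshake.
  chain <- openssl::download_ssl_cert(host, port)
  # Convert each certificate into a plain list of its fields.
  info <- lapply(chain, as.list)
  data.frame(
    subject = vapply(info, function(x) x$subject, character(1)),
    issuer  = vapply(info, function(x) x$issuer, character(1)),
    stringsAsFactors = FALSE
  )
}

# Usage:
# describe_chain("search.codal.ir")
```

If only the server's own certificate comes back, with an issuer that is not a root in your CA store, that matches the missing-intermediate situation described above.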
The real and proper fix should be done by the server admins. This is a server setup problem.
You can download the intermediate cert manually and put it in your CA store, the PEM file you tell curl (or other client) to use when verifying the peer.
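In R, that could look roughly like the sketch below, which uses the curl package's cainfo option to point at a PEM file. The helper name fetch_json_with_ca() and the file name in the usage line are hypothetical. I am assuming the PEM file holds your usual CA bundle with the intermediate cert appended, and that your curl build honors cainfo (TLS backends differ, notably on Windows).

```r
# Sketch only: fetch JSON while verifying the peer against a custom PEM bundle.
# fetch_json_with_ca() is a hypothetical helper, not part of curl or jsonlite.
fetch_json_with_ca <- function(url, ca_file) {
  # Fail early if the bundle is missing.
  stopifnot(file.exists(ca_file))
  # cainfo is libcurl's option for the CA bundle used in verification.
  h <- curl::new_handle(cainfo = ca_file)
  res <- curl::curl_fetch_memory(url, handle = h)
  if (res$status_code >= 400) {
    stop("HTTP error ", res$status_code)
  }
  # The body arrives as raw bytes; convert to text before parsing.
  jsonlite::fromJSON(rawToChar(res$content))
}

# Usage, with `url` as defined in the question and a hypothetical bundle file:
# data <- fetch_json_with_ca(url, "cacert-plus-intermediate.pem")[[3]]
```

Unlike turning verification off, this keeps the certificate check in place and only extends the set of certs the client can use to build the chain.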
The SSL Labs page says the following about the missing intermediate cert:
Certum Organization Validation CA SHA2
Fingerprint SHA256: fd02362244f31266caff005818d1004ec4eb08fb239aafaaafff47497d6005d6
Pin SHA256: 51GveKNrpJjtGpXY5QDx03s3YTQwaQic6dWBqo3zX6s=
RSA 2048 bits (e 65537) / SHA256withRSA
(I couldn't find where to download it from)
You can disable certificate verification completely, which will then allow your program to continue. But you've then given up on all security, and there is just sadness and tears down that road. Only do that for experiments, never for production.
Upvotes: 5