user1633298

Reputation: 21

Scrapy error: Missing scheme in request url

I am running into errors on some URLs while running Scrapy:

    ValueError: Missing scheme in request url: mailto:?body=https%3A%2F%2Fiview.abc.net.au%2Fshow%2Finsiders
    [scrapy.core.scraper:168|ERROR] Spider error processing <GET https://iview.abc.net.au/show/four-corners/series/2020/video/NC2003H028S00> (referer: None)

Here are my settings:

    "base_urls" : [
        {
          # Start crawling from
          "url": "https://www.abc.net.au/",

          # Overwrite the default crawler and use the RecursiveCrawler instead
          "crawler": "RecursiveCrawler",

The crawl works fine with the following setting:

    "base_urls" : [
        {
          # Start crawling from
          "url": "https://www.afr.com/",

          # Overwrite the default crawler and use the RecursiveCrawler instead
          "crawler": "RecursiveCrawler",

I'm not sure what I am missing here.

Upvotes: 0

Views: 393

Answers (1)

renatodvc

Reputation: 2564

You get different behavior because of the content being scraped. The problem is that at some point your spider tries to yield a Request for this URL:

mailto:?body=https%3A%2F%2Fiview.abc.net.au%2Fshow%2Finsiders

The correct URL is probably this:

https://iview.abc.net.au/show/insiders

A mailto: link is not something Scrapy can fetch, which is why the Request raises "Missing scheme". It's possible that you are scraping the wrong field, or that there is a mistake on the site where this "url" is retrieved.
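As a rough illustration of the point above, here is a minimal sketch that checks a link before it is turned into a request. It uses only the standard library; the helper names `is_crawlable` and `shared_url_from_mailto` and the `ALLOWED_SCHEMES` set are made up for this example and are not part of Scrapy or of your crawler configuration.

```python
from urllib.parse import urlparse, parse_qs

# Assumption for this sketch: only plain web links should be crawled.
ALLOWED_SCHEMES = {"http", "https"}


def is_crawlable(url):
    """Return True only for absolute http(s) URLs with a host."""
    parsed = urlparse(url)
    return parsed.scheme in ALLOWED_SCHEMES and bool(parsed.netloc)


def shared_url_from_mailto(url):
    """Return the URL carried in a mailto link's body parameter, or None."""
    parsed = urlparse(url)
    if parsed.scheme != "mailto":
        return None
    # parse_qs also decodes the percent-encoding (%3A -> ":", %2F -> "/").
    body = parse_qs(parsed.query).get("body")
    return body[0] if body else None


bad_link = "mailto:?body=https%3A%2F%2Fiview.abc.net.au%2Fshow%2Finsiders"
print(is_crawlable(bad_link))
print(shared_url_from_mailto(bad_link))
```

If you control the spider code, a check like `is_crawlable(url)` before yielding each Request would skip "share by email" links such as the one in your traceback. Whether the RecursiveCrawler you configured exposes a hook for this is something I have not verified.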

Upvotes: 1
