How can I navigate and extract restaurant details from zaubee.com when the href attribute is set to "#" for each restaurant link?

Question

How can I scrape zaubee.com with Scrapy to extract business details from each restaurant's page when the href attribute of every restaurant link is set to "#"?

I'm currently working on a web scraping project that gathers business information from zaubee.com. However, the href attribute of each restaurant link is set to "#", which prevents my spider from following the links to the individual restaurant pages and collecting the data I need.

import scrapy


class zaubeeSpider(scrapy.Spider):
    name = 'zaubeeerestaurant'
    # Bare domain, so requests to zaubee.com (as in start_urls) are not filtered as offsite.
    allowed_domains = ['zaubee.com']
    start_urls = ['https://zaubee.com/category/restaurant-in-fredonia-hclq6jom']

    def parse(self, response):
        # Each search result title is an <h2> inside the title wrapper.
        restaurant_links = response.xpath("//div[@class='search-result__title-wrapper']/h2")
        for restaurant in restaurant_links:
            name = restaurant.xpath(".//text()").get()
            # This is where the problem shows up: href comes back as "#".
            link = restaurant.xpath(".//@href").get()
            yield {
                'name': name,
                'link': link,
            }
            # response.follow() raises an error if the URL is None, so skip missing links.
            if link:
                yield response.follow(url=link, callback=self.parse_restaurant)

    def parse_restaurant(self, response):
        # Fields expected on an individual restaurant page.
        name = response.xpath("//h1[@class='postcard__title postcard__title--claimed']/text()").get()
        website = response.xpath("(//a[@class='profile__website__link']/@href)[1]").get()
        address = response.xpath("(//address[@class='profile__address--compact']/text())[1]").get()

        yield {
            'name': name,
            'website': website,
            'address': address,
        }

I've already written the Scrapy spider above, but I need help getting past this problem. What method or workaround can I use to visit each restaurant's page and extract the information I need?

Output for one entry:

2023-06-04 23:38:10 [scrapy.core.scraper] DEBUG: Scraped from <200 https://zaubee.com/category/restaurant-in-fredonia-hclq6jom>
{'name': 'Restaurants in Fredonia New York', 'link': '#'}

When the spider tries to follow that link, the output is:

2023-06-04 23:38:12 [scrapy.core.scraper] DEBUG: Scraped from <200 https://zaubee.com/category/restaurant-in-fredonia-hclq6jom>
{'name': None, 'website': None, 'address': None}
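
The second log entry is consistent with how a bare "#" resolves. It is a fragment-only reference, so joining it to the category URL gives the category URL back, and parse_restaurant then runs against the listing page, where its XPath expressions match nothing. Here is a minimal sketch using only the standard library, assuming response.follow resolves relative links the same way urljoin does (resolve_link is a hypothetical helper, not part of Scrapy):

```python
from urllib.parse import urldefrag, urljoin

CATEGORY_URL = "https://zaubee.com/category/restaurant-in-fredonia-hclq6jom"


def resolve_link(base_url: str, href: str) -> str:
    """Resolve href against base_url and drop the fragment, which is never sent to the server."""
    return urldefrag(urljoin(base_url, href)).url


if __name__ == "__main__":
    # A bare "#" points back at the page it was found on.
    print(resolve_link(CATEGORY_URL, "#") == CATEGORY_URL)
```

This suggests the None values come from requesting the wrong page rather than from the selectors in parse_restaurant being wrong.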

I want to open each restaurant's link and collect the restaurant's name, address, telephone number, and opening hours.
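
One direction worth checking is whether the real URL is stored somewhere other than href, for example in a data attribute that JavaScript reads on click. The sketch below is only an illustration under that assumption. The attribute names in CANDIDATE_ATTRS and the "/biz/example" path are made up, and the actual page source has to be inspected to see which attribute, if any, holds the link. In a spider, the attribute mapping could come from a selector's attrib property.

```python
from typing import Mapping, Optional
from urllib.parse import urljoin

# Hypothetical attribute names; inspect the real page source to see which one applies.
CANDIDATE_ATTRS = ("href", "data-href", "data-url", "data-link")


def pick_detail_url(attrs: Mapping[str, str], base_url: str) -> Optional[str]:
    """Return the first usable URL found in an element's attributes, or None."""
    for attr_name in CANDIDATE_ATTRS:
        value = (attrs.get(attr_name) or "").strip()
        # Skip empty values, bare fragments such as "#", and javascript: pseudo-links.
        if not value or value.startswith("#") or value.lower().startswith("javascript:"):
            continue
        return urljoin(base_url, value)
    return None


if __name__ == "__main__":
    base = "https://zaubee.com/category/restaurant-in-fredonia-hclq6jom"
    # Made-up attributes, for illustration only.
    print(pick_detail_url({"href": "#", "data-href": "/biz/example"}, base))
    print(pick_detail_url({"href": "#"}, base))
```

If none of the attributes contains a usable URL, the links are probably built by JavaScript at runtime, and the browser's network tab is the next place to look for the request that loads a restaurant page.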
