Mojsa

Reputation: 27

Scraping Booking.com with Scrapy and playwright-python returns an error

I'm using Scrapy and Playwright to scrape booking.com. I need to click a button and get the AJAX response, but when I run my code it returns this error:

TypeError: Page.locator() missing 1 required positional argument: 'selector'

This is my code:

import scrapy
from scrapy.crawler import CrawlerProcess
from scrapy.selector import Selector
from scrapy.http import HtmlResponse
from scrapy_playwright.page import PageMethod
from playwright.async_api import Page,expect


class BookingSpider(scrapy.Spider):
    name='booking'
    start_urls=["https://www.booking.com/hotel/it/hotelnordroma.en-gb.html?aid=304142&checkin=2025-04-15&checkout=2025-04-16#map_opened-map_trigger_header_pin"]

    def start_requests(self):
        yield scrapy.Request(self.start_urls[0], meta={
            "playwright": True,
            "playwright_include_page":True,
            "playwright_page_methods":[
                PageMethod("wait_for_selector",".e1793b8db2")
            ]
            })

    def parse(self,response):
        Page.locator("xpath=//*[@class='e1793b8db2'][1]").click()
        with open("copy.txt", "w", encoding="utf-8") as file:
            file.write((response.text))

 process=CrawlerProcess()
 process.crawl(BookingSpider)
 process.start()

Error message:

File "c:\Users\mojsa\AppData\Local\Programs\Python\Python313\Lib\site-packages\twisted\internet\defer.py", line 1088, in _runCallbacks
    current.result = callback(  # type: ignore[misc]
                     ~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
        current.result, *args, **kwargs
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "c:\Users\mojsa\AppData\Local\Programs\Python\Python313\Lib\site-packages\scrapy\spiders\__init__.py", line 86, in _parse
    return self.parse(response, **kwargs)
           ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^
  File "J:\SeSa\booking\booking\spiders\booking.py", line 27, in parse
    Page.locator("xpath=//*[@class='e1793b8db2'][1]").click()
    ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: Page.locator() missing 1 required positional argument: 'selector'

Upvotes: -1

Views: 53

Answers (1)

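Issues:

Incorrect start_urls usage in start_requests

start_urls is a class attribute, so inside start_requests it must be referenced as self.start_urls.

Incorrect use of Page.locator

Page in your parse function is the class imported from playwright.async_api, not a live page object, so Page.locator("...") passes the string as self and raises the missing 'selector' error. Get the page object from response.meta["playwright_page"] instead (it is there because the request sets playwright_include_page to True), and make parse an async method so the click can be awaited.

Incorrect indentation for CrawlerProcess

process = CrawlerProcess() and the related lines should not be inside the class; put them at module level.

Missing imports

You need to import scrapy, CrawlerProcess, and PageMethod. Note that PageMethod comes from scrapy_playwright.page, not from playwright.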

Upvotes: -1
