Reputation: 27
I'm using Scrapy and Playwright to scrape booking.com. I need to click a button and capture the AJAX response, but when I run my code it raises this error:
TypeError: Page.locator() missing 1 required positional argument: 'selector'
This is my code:
import scrapy
from scrapy.crawler import CrawlerProcess
from scrapy.selector import Selector
from scrapy.http import HtmlResponse
from scrapy_playwright.page import PageMethod
from playwright.async_api import Page,expect

class BookingSpider(scrapy.Spider):
    name='booking'
    start_urls=["https://www.booking.com/hotel/it/hotelnordroma.en-gb.html?aid=304142&checkin=2025-04-15&checkout=2025-04-16#map_opened-map_trigger_header_pin"]

    def start_requests(self):
        yield scrapy.Request(self.start_urls[0], meta={
            "playwright": True,
            "playwright_include_page":True,
            "playwright_page_methods":[
                PageMethod("wait_for_selector",".e1793b8db2")
            ]
        })

    def parse(self,response):
        Page.locator("xpath=//*[@class='e1793b8db2'][1]").click()
        with open("copy.txt", "w", encoding="utf-8") as file:
            file.write((response.text))

process=CrawlerProcess()
process.crawl(BookingSpider)
process.start()
Error message:
  File "c:\Users\mojsa\AppData\Local\Programs\Python\Python313\Lib\site-packages\twisted\internet\defer.py", line 1088, in _runCallbacks
    current.result = callback( # type: ignore[misc]
                     ~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
        current.result, *args, **kwargs
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "c:\Users\mojsa\AppData\Local\Programs\Python\Python313\Lib\site-packages\scrapy\spiders\__init__.py", line 86, in _parse
    return self.parse(response, **kwargs)
           ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^
  File "J:\SeSa\booking\booking\spiders\booking.py", line 27, in parse
    Page.locator("xpath=//*[@class='e1793b8db2'][1]").click()
    ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: Page.locator() missing 1 required positional argument: 'selector'
Upvotes: -1
Views: 53
Reputation: 1
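Issues:
Use of start_urls in start_requests
start_urls is a class attribute, so inside start_requests it has to be referenced as self.start_urls.
Incorrect use of Page.locator
Page in your parse function is the Playwright class, not a page instance, so the selector string gets bound to self and Python reports selector as missing. You need to take the page from the meta field of the response, response.meta["playwright_page"], and call locator on that. Because this is Playwright's async API, parse has to be an async def and the click has to be awaited.
Indentation of CrawlerProcess
process = CrawlerProcess() and the related lines should not be inside the class.
Imports
You need to import scrapy, CrawlerProcess, and PageMethod. PageMethod comes from scrapy_playwright.page, not from playwright.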
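The Page.locator point can be reproduced without Scrapy or a browser. Below is a minimal sketch that assumes a hypothetical stand-in class named Page (it is not the Playwright class, it only has a locator method with the same shape). It shows why calling the method on the class raises the TypeError from the question, and why calling it on an instance does not.

```python
# Hypothetical stand-in for playwright.async_api.Page, for illustration only.
class Page:
    def locator(self, selector):
        return f"<Locator {selector}>"


selector = "xpath=//*[@class='e1793b8db2'][1]"

# Called on the class: the string is bound to `self`,
# so Python reports `selector` as the missing argument.
try:
    Page.locator(selector)
except TypeError as exc:
    error_message = str(exc)

# Called on an instance: `self` is the instance and the string is `selector`.
# In the spider, the instance would come from response.meta["playwright_page"]
# instead of being created with Page().
page = Page()
locator = page.locator(selector)

print(error_message)
print(locator)
```

In the real spider this would probably mean turning parse into an async def, reading page = response.meta["playwright_page"], awaiting page.locator(...).click(), and closing the page with await page.close() when done.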
Upvotes: -1