Reputation: 11
I am new to Scrapy and am trying to scrape https://www.sakan.co/result?srv=1&prov=&cty=&maintyp=1&typ=5&minpr=&maxpr=&bdrm=&blk=
The pagination links on this page use an href like the following:
href="javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("ctl00$Content$rptPaging$ctl02$lbPaging", "", true, "", "", false, true))"
The data is loaded dynamically. I tried to find the source (an API call, if any) of the data being loaded but could not find one. How can I navigate to the next page and scrape the data using Scrapy?
Upvotes: 1
Views: 269
Reputation: 2564
What this JavaScript effectively does is trigger a POST request. You can check the details of the request in the Network tab of your browser's developer tools (F12 in Firefox: open the tab, then click the link).
Your Scrapy spider needs to reproduce that same POST request. All the information in the body is available in the page. Just keep in mind that the fields that start with __, like __VIEWSTATE, are instance dependent, so you need to retrieve their values from the page your spider loads; copying and pasting them from the browser will usually fail.
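To make the idea concrete, here is a rough standard-library sketch of how the POST body could be assembled from a freshly loaded page. It assumes the usual ASP.NET WebForms behaviour, where the first argument of WebForm_PostBackOptions is sent as __EVENTTARGET with an empty __EVENTARGUMENT; the helper names (HiddenFieldParser, postback_target, build_postback_fields, encode_body) are made up for illustration, so compare the result against what your browser actually sends.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and (attrs.get("type") or "").lower() == "hidden":
            name = attrs.get("name")
            if name:
                self.fields[name] = attrs.get("value") or ""


def postback_target(href):
    """Return the first argument of WebForm_PostBackOptions(...)."""
    match = re.search(r"WebForm_PostBackOptions\(\s*[\"']([^\"']+)[\"']", href)
    if match is None:
        raise ValueError("no WebForm_PostBackOptions call found in href")
    return match.group(1)


def build_postback_fields(html, href):
    """Combine the page's hidden fields with the postback target."""
    parser = HiddenFieldParser()
    parser.feed(html)
    fields = dict(parser.fields)  # includes __VIEWSTATE and friends
    # Assumption: standard WebForms postback field names.
    fields["__EVENTTARGET"] = postback_target(href)
    fields["__EVENTARGUMENT"] = ""
    return fields


def encode_body(fields):
    """URL-encode the fields the way a form submission would."""
    return urlencode(fields)
```

The point is that the hidden values are read from the same response you are about to post back to, never hard-coded.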
The easiest way to do this is with the FormRequest.from_response() method. However, it's important to check that the method produces the same request body as your browser; quite often it skips a required field or adds an extra one, because it relies on the page's <form>.
You can read more on scraping this kind of page in the Scrapy FAQ.
Finally, one last tip: if your request body is just like the browser's but the request still fails, you might need to reproduce the request headers as well.
Upvotes: 1