venkatkotta

Reputation: 11

Scrapy: webpage next button uses WebForm_DoPostBackWithOptions()

I am new to Scrapy and am trying to scrape https://www.sakan.co/result?srv=1&prov=&cty=&maintyp=1&typ=5&minpr=&maxpr=&bdrm=&blk=

The next-page button on this page is a link with the following href:

href="javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("ctl00$Content$rptPaging$ctl02$lbPaging", "", true, "", "", false, true))"

The data is loaded dynamically. I tried to find the source of the data (an API call, if any) but could not find one. How can I navigate to the next page and scrape the data using Scrapy?

Upvotes: 1

Views: 269

Answers (1)

renatodvc

Reputation: 2564

What this JavaScript effectively does is trigger a POST request. You can check the details of the request in the Network tab of your browser's developer tools (F12 in Firefox: open the tab, then click the link).

Your spider needs to reproduce that same POST request. All the information in the body is available in the page. Keep in mind that the fields starting with __, such as __VIEWSTATE, are instance dependent, so you need to retrieve their values from the page your spider loads; copying and pasting them from the browser will usually fail.
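To make the shape of that body concrete, here is a minimal standard-library sketch (no Scrapy involved). It assumes the page follows the usual ASP.NET WebForms convention, where the first two arguments of WebForm_PostBackOptions are sent as __EVENTTARGET and __EVENTARGUMENT alongside the page's hidden inputs. The names HiddenInputs and postback_body are hypothetical helpers, and the real page may require additional, non-hidden fields, so compare the result with what the Network tab shows.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenInputs(HTMLParser):
    """Collect name/value pairs of every <input type="hidden"> in a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        is_hidden = (attrs.get("type") or "").lower() == "hidden"
        if tag == "input" and is_hidden and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value") or ""


def postback_body(html, href):
    """Return the URL-encoded POST body that clicking `href` would submit.

    Assumption: only hidden inputs plus the two event fields are needed.
    """
    parser = HiddenInputs()
    parser.feed(html)
    # The first two string arguments are the event target and event argument.
    match = re.search(r'WebForm_PostBackOptions\("([^"]*)",\s*"([^"]*)"', href)
    if match is None:
        raise ValueError("href does not contain a WebForm_PostBackOptions call")
    fields = dict(parser.fields)  # fresh __VIEWSTATE etc. from this page load
    fields["__EVENTTARGET"], fields["__EVENTARGUMENT"] = match.groups()
    return urlencode(fields)
```

Because the hidden values are read from the HTML that was just downloaded, they always match the current session, which is the point of the warning above.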

The easiest way to do this is the FormRequest.from_response() method. However, it's important to check that the method produces the same request body as your browser does; quite often it skips a required field or adds an extra one, because it relies on the page's <form>.

You can read more about scraping this kind of page in the Scrapy FAQ.

One last tip: if your request body is identical to the browser's but the request still fails, you might need to reproduce the request headers as well.

Upvotes: 1
