Reputation: 3891
New to scrapy and wondering if anyone can point me to a sample project using scrapy to submit to HTML forms that have hidden fields in cases where the action page of the form is not the same address as where the form itself is presented.
What is the easiest way to do this in Scrapy? I can see that you could write two spiders - one first to get the html with the form and pick out all the hidden fields and then a second one to use the info with the hidden fields to submit the form.
I am wondering if there is a one-step process for this instead (the Scrapy request documentation seems to assume it's all on the same page when it says that using FormRequest.from_response will take care of hidden fields). If so, can someone tell me where I can find the steps for it?
Upvotes: 1
Views: 405
Reputation: 2594
FormRequest extends the Request object, so you can get the form data, including the hidden values, with FormRequest.from_response and, if needed, change the URL after that.
Demo Pseudo Code:
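    import scrapy


    class ExampleSpider(scrapy.Spider):
        name = 'example.com'
        start_urls = ['http://www.example.com/FormPage.php']

        def parse(self, response):
            # from_response copies the form's fields, hidden ones included
            request = scrapy.FormRequest.from_response(
                response,
                callback=self.parse_response_from_Form
            )
            # replace() returns a new request rather than changing this one,
            # so the result has to be assigned back
            request = request.replace(url='http://www.other-site.com/')
            return request

        def parse_response_from_Form(self, response):
            # go on here...
            pass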
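As far as I know, from_response also resolves the form's action attribute against the page URL by itself, so the request it builds should already point at the action page, and overriding the URL is only needed when you want a different target. In case it helps to see the idea without Scrapy, here is a rough standard-library sketch of that step: collect the input fields (hidden ones included) and join the action with the page address. This is not Scrapy's actual implementation; FormFieldParser, build_form_submission and the sample HTML are names and data I made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FormFieldParser(HTMLParser):
    """Collects the action and the named inputs of the first <form> in a page."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and not self._done:
            self._in_form = True
            self.action = attrs.get("action")
        elif tag == "input" and self._in_form and attrs.get("name"):
            # Hidden inputs are ordinary inputs here, so they are kept too.
            self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True


def build_form_submission(page_url, html, overrides=None):
    """Return (target_url, form_data) for the first form found in html."""
    parser = FormFieldParser()
    parser.feed(html)
    # A relative action is resolved against the page that served the form;
    # a missing action falls back to the page URL itself.
    target = urljoin(page_url, parser.action or "")
    data = dict(parser.fields)
    data.update(overrides or {})
    return target, data


# Hypothetical page content, only for demonstration.
SAMPLE_HTML = """
<form action="/submit.php" method="post">
  <input type="hidden" name="token" value="abc123">
  <input type="text" name="user" value="">
</form>
"""

target, data = build_form_submission(
    "http://www.example.com/FormPage.php", SAMPLE_HTML, {"user": "alice"}
)
```

With the sample above, target ends up on /submit.php of the same host and data carries the hidden token next to the overridden user field, which is the same shape of request you would expect from_response to produce for such a form.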
Upvotes: 1