user1592380

Reputation: 36347

Getting javascript postback parameters with scrapy

I have the following element:

<a class="html-attribute-value html-external-link" target="_blank" href='javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("ResultsGrid$15", "", false, "", "webproperty.aspx?s=id&amp;s=15&amp;time=201606080118012&amp", false, true))'>javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions(&amp;quot;ucResultsGrid$R000000015&amp;quot;, &amp;quot;&amp;quot;, false, &amp;quot;&amp;quot;, &amp;quot;webprop.aspx?s=id&amp;amp;sr=15&amp;amp;time=201606080118012&amp;amp;id=15&amp;quot;, false, true))</a>

I want to extract the JavaScript parameters so that I can try to reconstruct the request produced by clicking the link. I've found that:

response.selector.xpath('//*[@id="ResultsGrid$15"]/@href').extract()

Out[20]: [u'javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("ResultsGrid$15", "", false, "", "webproperty.aspx?s=id&sdata=15&time=201606080037034&id=15", false, true))']

This looks right: the HTML entities in the JavaScript parameters come back decoded. How do I extract the individual parameters from this string?

Upvotes: 2

Views: 244

Answers (1)

alecxe

Reputation: 474191

You can do it with the .re() method and multiple capturing groups; the captured groups come back as a flat list of strings:

response.selector.xpath('//*[@id="ResultsGrid$15"]/@href').re(r'javascript:WebForm_DoPostBackWithOptions\(new WebForm_PostBackOptions\("(.*?)", "(.*?)", (.*?), "(.*?)", "(.*?)", (.*?), (.*?)\)\)')

Here I'm using a fairly broad .*? (a non-greedy match for any characters, any number of times), but you can be stricter about which characters each parameter is allowed to contain.
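As a rough sketch of what "more strict" could look like, the same idea can be tried outside Scrapy with Python's standard re module. The sample href is the one from the question's output. The group names (event_target, action_url, and so on) and the parse_postback helper are my own labels for illustration, chosen to follow the argument order of ASP.NET's WebForm_PostBackOptions as I understand it; they are not something Scrapy provides.

```python
import re

# Sample href, copied from the XPath output shown in the question.
href = (
    'javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions('
    '"ResultsGrid$15", "", false, "", '
    '"webproperty.aspx?s=id&sdata=15&time=201606080037034&id=15", false, true))'
)

# Stricter variant of the pattern: quoted arguments may contain anything
# except a double quote, and unquoted arguments must be true or false.
# Group names are illustrative labels (an assumption), not an official API.
POSTBACK_RE = re.compile(
    r'WebForm_PostBackOptions\('
    r'"(?P<event_target>[^"]*)", '
    r'"(?P<event_argument>[^"]*)", '
    r'(?P<validation>true|false), '
    r'"(?P<validation_group>[^"]*)", '
    r'"(?P<action_url>[^"]*)", '
    r'(?P<track_focus>true|false), '
    r'(?P<client_submit>true|false)\)'
)


def parse_postback(value):
    """Return the postback arguments as a dict, or None if nothing matches."""
    match = POSTBACK_RE.search(value)
    return match.groupdict() if match else None


params = parse_postback(href)
```

Named groups make the result easier to use than a positional list, for example params['event_target'] and params['action_url']. Inside a spider, the compiled pattern can be applied to the string returned by the XPath above in the same way.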

Upvotes: 3
