Reputation: 7605
I'm experimenting with Scrapy and following this tutorial. Things look good, but I noticed that Steam changed its age check, so there is no longer a form in the DOM. The suggested solution therefore no longer works:
form = response.css('#agegate_box form')
action = form.xpath('@action').extract_first()
name = form.xpath('input/@name').extract_first()
value = form.xpath('input/@value').extract_first()
formdata = {
    name: value,
    'ageDay': '1',
    'ageMonth': '1',
    'ageYear': '1955'
}
yield FormRequest(
    url=action,
    method='POST',
    formdata=formdata,
    callback=self.parse_product
)
Checking an example game that forces an age check, I noticed that the View Page button is no longer part of a form:
<a class="btnv6_blue_hoverfade btn_medium" href="#" onclick="ViewProductPage()"><span>View Page</span></a>
And the function being called will eventually call this one:
function CheckAgeGateSubmit( callbackFunc )
{
    if ( $J('#ageYear').val() == 2019 )
    {
        ShowAlertDialog( '', 'Please enter a valid date' );
        return false;
    }
    $J.post(
        'https://store.steampowered.com/agecheckset/' + "app" + '/9200/',
        {
            sessionid: g_sessionID,
            ageDay: $J('#ageDay').val(),
            ageMonth: $J('#ageMonth').val(),
            ageYear: $J('#ageYear').val()
        }
    ).done( function( response ) {
        switch ( response.success )
        {
            case 1:
                callbackFunc();
                break;
            case 24:
                top.location.reload();
                break;
            case 15:
            case 2:
                ShowAlertDialog( 'Error', 'There was a problem verifying your age. Please try again later.' );
                break;
        }
    } );
}
So basically this makes a POST request with some data. What would be the best way to do this in Scrapy, now that there is no longer a form? I'm thinking of dropping the code that extracts the form and simply sending the request with a FormRequest object, but is that the way to go? An alternative could be to set the age cookies and pass them on every request, so that the age check is possibly skipped altogether.
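For reference, the direct-POST idea could be sketched roughly as below. This is only a sketch under assumptions: `extract_session_id` and `build_agecheck_post` are hypothetical helper names, the regular expression assumes the page embeds the session ID as `g_sessionID = "..."`, and the date values are the ones from the tutorial snippet. The URL and field names are taken from `CheckAgeGateSubmit` above.

```python
import re

# URL pattern taken from the $J.post() call in CheckAgeGateSubmit().
AGECHECK_URL = "https://store.steampowered.com/agecheckset/app/{app_id}/"


def extract_session_id(page_html):
    """Return the g_sessionID value found in the page source, or None.

    Assumption: the page contains a statement like g_sessionID = "abc123";
    """
    match = re.search(r'g_sessionID\s*=\s*"([^"]+)"', page_html)
    return match.group(1) if match else None


def build_agecheck_post(app_id, session_id, day=1, month=1, year=1955):
    """Build the URL and form fields that CheckAgeGateSubmit() posts."""
    url = AGECHECK_URL.format(app_id=app_id)
    formdata = {
        "sessionid": session_id,
        "ageDay": str(day),
        "ageMonth": str(month),
        "ageYear": str(year),
    }
    return url, formdata


# Usage inside a spider callback (requires Scrapy, not imported here):
#   session_id = extract_session_id(response.text)
#   url, formdata = build_agecheck_post(9200, session_id)
#   yield FormRequest(url=url, formdata=formdata, callback=self.parse_product)
```

Note that in the JavaScript the product page is only loaded after the POST reports `success == 1`, so the callback would presumably have to check the response and then request the product URL again.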
Thanks!
Upvotes: 0
Views: 179
Reputation: 21436
You should probably just set an appropriate cookie and you'll be let right through!
If you take a look at the cookies your browser holds when entering the page, you can replicate them in Scrapy:
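from scrapy import Request

cookies = {
    'wants_mature_content': '1',
    'birthtime': '189302401',
    'lastagecheckage': '1-January-1976',
}
url = 'https://store.steampowered.com/app/9200/RAGE/'
# cookies must be passed by keyword; the second positional argument is the callback
Request(url, cookies=cookies)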
lastagecheckage should probably be enough on its own, but I haven't tested it.
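If you want a different birth date, the cookie values above can be generated instead of hard-coded. The following is a minimal sketch: `age_gate_cookies` is a hypothetical helper name, and the format is inferred only from the values shown above, where `birthtime` equals 1 January 1976, 00:00:01 UTC as a Unix timestamp. Whether Steam needs that one-second offset, or validates the values at all, is untested.

```python
from datetime import datetime, timezone

# Spelled out explicitly so the result does not depend on the system locale.
MONTH_NAMES = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def age_gate_cookies(year=1976, month=1, day=1):
    """Build age-check cookies in the same shape as the values above."""
    # One second past midnight UTC, mirroring the observed birthtime value.
    birth = datetime(year, month, day, 0, 0, 1, tzinfo=timezone.utc)
    return {
        "wants_mature_content": "1",
        "birthtime": str(int(birth.timestamp())),
        "lastagecheckage": f"{day}-{MONTH_NAMES[month - 1]}-{year}",
    }


# Usage inside a spider (requires Scrapy, not imported here):
#   yield Request(url, cookies=age_gate_cookies(), callback=self.parse_product)
```

With the default arguments this reproduces exactly the three cookie values listed above.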
Upvotes: 1