Reputation: 1401
I have this script:
import requests
from bs4 import BeautifulSoup
with requests.Session() as c:
    body = {'username':'*****','password':'*********','submit':'Log In','mod':'www','ssl':'0','dest':'community'}
    con = c.post('https://secure.runescape.com/m=weblogin/login.ws', data=body)
    a = (con.cookies['session'])
    cookies = dict(session=a)
    b = c.get('https://www.runescape.com/c=Xbn439ejpJo/account_settings.ws?jptg=ia&jptv=navbar',cookies=cookies)
    print(b.text)
With the first link I manage to log in. When I try to reach the second page, I am not logged in... why?
Upvotes: 0
Views: 144
Reputation: 44
This depends on the login tokens the website uses. Runescape is a popular game, so it may well have several methods of preventing the kind of scraping you seem to be attempting.
Other than that, logins normally rely on a session id, which is sent with every request (usually as a cookie in the request headers). You've already added the cookie for this, so that doesn't seem to be the problem.
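As a rough illustration of how a session id travels with each request, here is a minimal offline sketch. It prepares requests without sending them, so you can inspect the Cookie header that a requests.Session would attach. The cookie value and the ".runescape.com" domain are placeholders I've assumed for the example; the real site may scope its cookie differently, and cookie_header_for is just a helper name made up here.

```python
import requests


def cookie_header_for(session, url):
    """Return the Cookie header the session would send to url (no network I/O)."""
    prepared = session.prepare_request(requests.Request("GET", url))
    return prepared.headers.get("Cookie")


session = requests.Session()
# Placeholder value and domain: assumptions for illustration only.
session.cookies.set("session", "PLACEHOLDER", domain=".runescape.com", path="/")

# The cookie is attached for hosts under the cookie's domain...
print(cookie_header_for(
    session,
    "https://www.runescape.com/c=Xbn439ejpJo/account_settings.ws?jptg=ia&jptv=navbar",
))
# ...and left out for unrelated hosts.
print(cookie_header_for(session, "https://example.com/"))
```

A Session also stores cookies set by earlier responses on its own, so passing cookies= by hand on each call is usually unnecessary. Printing the prepared headers like this is a quick way to check what would actually be sent.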
To debug this, open the page in a browser and look at the raw request it sends using the developer tools (Chrome and Firefox both have them), then mimic the request you see there in Python.
It would also be good to set the "User-Agent" header to that of Chrome or a similar browser, so that Runescape doesn't automatically detect that it's being scraped.
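One way to mimic a browser's request is to set browser-like default headers on the session. The sketch below is only an example under assumptions: the header values are hypothetical and should be replaced with the ones you see in your own browser's network tab, and make_browser_like_session is a helper name invented for this example. The request is prepared but not sent, so the result can be inspected offline.

```python
import requests

# Hypothetical header values: copy the real ones from your browser's developer tools.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
    ),
    "Accept-Language": "en-US,en;q=0.9",
}


def make_browser_like_session(headers=BROWSER_HEADERS):
    """Create a Session whose default headers resemble a browser's."""
    session = requests.Session()
    # update() overrides the default "python-requests/x.y" User-Agent.
    session.headers.update(headers)
    return session


session = make_browser_like_session()
# Prepare (but do not send) a request to see which headers would go out.
prepared = session.prepare_request(
    requests.Request("GET", "https://www.runescape.com/")
)
print(prepared.headers["User-Agent"])
```

Headers set on the session apply to every later c.get or c.post call, so the login request and the follow-up page request present the same browser identity.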
Note: Always check the site's robots.txt and don't go against its policies while doing this. They can easily ban your IP addresses and accounts if you do.
Upvotes: 2