Reputation: 4233
I'm trying to scrape a page on www.roblox.com that requires me to be logged in. I have done this using the .ROBLOSECURITY cookie, but that cookie changes every few days. Instead, I want to log in through the login form using Python. The form and what I have so far are below. I do NOT want to use any add-on libraries like mechanize or requests.
Form:
<form action="/newlogin" id="loginForm" method="post" novalidate="novalidate" _lpchecked="1">
<div id="loginarea" class="divider-bottom" data-is-captcha-on="False">
<div id="leftArea">
<div id="loginPanel">
<table id="logintable">
<tbody>
<tr id="username">
<td><label class="form-label" for="Username">Username:</label></td>
<td><input class="text-box text-box-medium valid" data-val="true" data-val-required="The Username field is required." id="Username" name="Username" type="text" value="" autocomplete="off" aria-required="true" aria-invalid="false" style="cursor: auto; background-image: url(data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR4nGP6zwAAAgcBApocMXEAAAAASUVORK5CYII=);"></td>
</tr>
<tr id="password">
<td><label class="form-label" for="Password">Password:</label></td>
<td><input class="text-box text-box-medium" data-val="true" data-val-required="The Password field is required." id="Password" name="Password" type="password" autocomplete="off" style="cursor: auto; background-image: url(data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR4nGP6zwAAAgcBApocMXEAAAAASUVORK5CYII=);"></td>
</tr>
</tbody>
</table>
<div>
</div>
<div>
<div id="forgotPasswordPanel">
<a class="text-link" href="/Login/ResetPasswordRequest.aspx" target="_blank">Forgot your password?</a>
</div>
<div id="signInButtonPanel" data-use-apiproxy-signin="False" data-sign-on-api-path="https://api.roblox.com/login/v1">
<a roblox-js-onclick="" class="btn-medium btn-neutral">Sign In</a>
<a roblox-js-oncancel="" class="btn-medium btn-negative">Cancel</a>
</div>
<div class="clearFloats">
</div>
</div>
<span id="fb-root">
<div id="SplashPageConnect" class="fbSplashPageConnect">
<a class="facebook-login" href="/Facebook/SignIn?returnTo=/home" ref="form-facebook">
<span class="left"></span>
<span class="middle">Login with Facebook<span>Login with Facebook</span></span>
<span class="right"></span>
</a>
</div>
</span>
</div>
</div>
<div id="rightArea" class="divider-left">
<div id="signUpPanel" class="FrontPageLoginBox">
<p class="text">Not a member?</p>
<h2>Sign Up to Build & Make Friends</h2>
<a roblox-js-onsignup="" class="btn-medium btn-primary">Sign Up</a>
</div>
</div>
</div>
<input id="ReturnUrl" name="ReturnUrl" type="hidden" value="">
</form>
What I have so far:
import cookielib
import urllib
import urllib2
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
urllib2.install_opener(opener)
authentication_url = 'http://www.roblox.com/newlogin'
payload = {
    'ReturnUrl' : 'http://www.roblox.com/home',
    'Username' : 'usernamehere',
    'Password' : 'passwordhere'
}
data = urllib.urlencode(payload)
req = urllib2.Request(authentication_url, data)
resp = urllib2.urlopen(req)
contents = resp.read()
print contents
What is wrong with my code? I only get the login page when I print contents.
PS: The login page is HTTPS.
Upvotes: 1
Views: 562
Reputation: 897
I wrote this class a few weeks ago using only urllib.request (Python 3) for some web scraping and automatic tab opening. It may help you, or at least point you in the right direction.
import urllib.request
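class Log_in:
    def __init__(self, loginURL, username, password):
        self.loginURL = loginURL
        self.username = username
        self.password = password

    def log_in_to_site(self):
        # HTTPBasicAuthHandler answers HTTP Basic (401) challenges; it does not submit HTML forms.
        # realm=None acts as a catch-all only with the default-realm password manager.
        password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
        password_mgr.add_password(realm=None,
                                  uri=self.loginURL,
                                  user=self.username,
                                  passwd=self.password)
        auth_handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
        # Install globally so later urllib.request.urlopen() calls use these credentials.
        opener = urllib.request.build_opener(auth_handler)
        urllib.request.install_opener(opener)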
Upvotes: 1
Reputation: 38667
Solution from OP.
I finished the script myself with the code below:
import cookielib
import urllib
import urllib2
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
urllib2.install_opener(opener)
authentication_url = 'https://www.roblox.com/newlogin'
payload = {
    'username' : 'YourUsernameHere',
    'password' : 'YourPasswordHere',
    '' : 'Log In',
}
data = urllib.urlencode(payload)
req = urllib2.Request(authentication_url, data)
resp = urllib2.urlopen(req)
PageYouWantToOpen = urllib2.urlopen("http://www.roblox.com/develop").read()
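For readers on Python 3, where cookielib, urllib and urllib2 were reorganised into http.cookiejar, urllib.parse and urllib.request, the same approach might look roughly like the sketch below. It uses only the standard library. The helper names (build_opener_with_cookies, build_login_request, has_cookie, fetch_after_login) are made up for illustration, and treating the presence of the .ROBLOSECURITY cookie as a sign of a successful login is an assumption. The form fields and URLs are copied from the script above and may no longer match the live site.

```python
import http.cookiejar
import urllib.parse
import urllib.request

LOGIN_URL = 'https://www.roblox.com/newlogin'  # copied from the script above


def build_opener_with_cookies():
    """Return (opener, jar); the jar collects cookies set by responses."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    opener.addheaders = [('User-agent', 'Mozilla/5.0')]
    return opener, jar


def build_login_request(username, password, url=LOGIN_URL):
    """Build the login POST; the field names mirror the script above."""
    payload = {'username': username, 'password': password, '': 'Log In'}
    # In Python 3 the request body must be bytes, not str.
    data = urllib.parse.urlencode(payload).encode('utf-8')
    return urllib.request.Request(url, data)


def has_cookie(jar, name):
    """Return True if the jar holds a cookie with the given name."""
    return any(cookie.name == name for cookie in jar)


def fetch_after_login(username, password, page_url):
    """Log in, then fetch page_url with the same cookie-carrying opener."""
    opener, jar = build_opener_with_cookies()
    opener.open(build_login_request(username, password))
    # Assumption: a successful login sets the .ROBLOSECURITY cookie.
    if not has_cookie(jar, '.ROBLOSECURITY'):
        raise RuntimeError('login did not set the session cookie')
    return opener.read() if False else opener.open(page_url).read()
```

Calling fetch_after_login('YourUsernameHere', 'YourPasswordHere', 'http://www.roblox.com/develop') would then play the role of the last line of the script above. Returning the opener's own jar, rather than installing the opener globally, keeps the session cookies scoped to this one login.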
Upvotes: 1