Reputation: 259
I am trying to log into Netflix with Python. It would work perfectly, but I can't get it to detect whether or not the login failed. The code looks like this:
#this is not purely my code! Thanks to Ori for the code
import urllib
username = raw_input('Enter your email: ')
password = raw_input('Enter your password: ')
params = urllib.urlencode(
    {'email': username,
     'password': password})
f = urllib.urlopen("https://signup.netflix.com/Login", params)
if "The login information you entered does not match an account in our records. Remember, your email address is not case-sensitive, but passwords are." in f.read():
    success = False
    print "Either your username or password was incorrect."
else:
    success = True
    print "You are now logged into netflix as", username
raw_input('Press enter to exit the program')
As always, many thanks!!
Upvotes: 1
Views: 4926
Reputation: 43077
First, I'll just share some verbiage I noticed on the Netflix site under Limitations on Use:
Any unauthorized use of the Netflix service or its contents will terminate the limited license granted by us and will result in the cancellation of your membership.
In short, I'm not sure what your script does after this, but some activities could jeopardize your relationship with Netflix. I did not read the whole ToS, but you should.
That said, there are plenty of legitimate reasons to scrape HTML, and I do it all the time. My first bet with this specific problem is that you're using the wrong detection string... Just send a bogus email/password and print the response. Perhaps you made an assumption based on what the page looks like when you log in with a browser, but the browser sends extra information that gets it further into the login process than your script gets.
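One way to make the detection string less brittle, once you have printed the real response, is to match a short fragment and ignore case and whitespace layout. This is only a sketch, written for Python 3 (the question's code is Python 2); `normalize`, `contains_marker` and `sample_page` are names I made up for illustration, and the sample HTML is invented, not what Netflix actually returns.

```python
import re


def normalize(text):
    """Collapse runs of whitespace to single spaces and lowercase the text."""
    return re.sub(r"\s+", " ", text).strip().lower()


def contains_marker(page, marker):
    """Return True if marker occurs in page, ignoring case and line breaks."""
    return normalize(marker) in normalize(page)


# Hypothetical response body: the error message is broken across lines,
# so an exact match on the full sentence would fail.
sample_page = """<div class="error">
    The login information you entered
    does not match an account in our records.
</div>"""

print(contains_marker(sample_page, "does not match an account in our records"))
```

If the fragment still does not match, the failure text is probably not in the response at all, which points back at the request itself.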
I wish I could offer specifics on what to do next, but I would rather not risk my relationship with 'flix to give a better answer to the question... so I'll just share a few observations I gleaned from scraping oodles of other websites that made it kind of hard to use web robots...
First, log in to your account with Firefox, and be sure to have the Live HTTP Headers add-on enabled and in capture mode... what you see when you log in live is invaluable to your scripting efforts... for instance, this was from a session while I logged in...
POST /Login HTTP/1.1
Host: signup.netflix.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.16) Gecko/20110319 Firefox/3.6.16
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
Referer: https://signup.netflix.com/Login?country=1&rdirfdc=true
--->Insert lots of private stuff here
Content-Type: application/x-www-form-urlencoded
Content-Length: 168

authURL=sOmELoNgTeXtStRiNg&nextpage=&SubmitButton=true&country=1&email=EmAiLAdDrEsS%40sOmEMaIlProvider.com&password=UnEnCoDeDpAsSwOrD
Pay particular attention to everything below the "Content-Length" field: those parameters are the POST body the browser sent.
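To see at a glance which fields the browser sent and the script did not, you can paste the captured body into a few lines of code. A minimal sketch for Python 3, using the sanitized body from the capture above; `captured_body`, `script_fields` and `missing` are just illustrative names.

```python
from urllib.parse import parse_qsl

# The POST body copied from the Live HTTP Headers capture (already sanitized).
captured_body = (
    "authURL=sOmELoNgTeXtStRiNg&nextpage=&SubmitButton=true&country=1"
    "&email=EmAiLAdDrEsS%40sOmEMaIlProvider.com&password=UnEnCoDeDpAsSwOrD"
)

# keep_blank_values=True so empty fields such as nextpage are not dropped.
fields = dict(parse_qsl(captured_body, keep_blank_values=True))

for name, value in fields.items():
    print(name, "=", repr(value))

# The fields the script in the question currently sends.
script_fields = {"email", "password"}
missing = sorted(set(fields) - script_fields)
print("Sent by the browser but not by the script:", missing)
```

Here the difference is authURL, nextpage, SubmitButton and country, so those are the first candidates to add to the script's parameters.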
Now log back out and pull up the login page again... chances are, you will see some of those fields hidden as state information in <input type="hidden"> tags... Some web apps keep state by feeding you fields and then use JavaScript to resubmit that same information in your login POST. I usually use lxml to parse the pages I receive... if you try it, keep in mind that lxml prefers utf-8, so I include code that automagically converts when it sees other encodings...
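import chardet
from urllib2 import urlopen

# req (the URL or Request object) and data (the urlencoded POST body)
# must be defined beforehand; they are not shown here
response = urlopen(req, data)
# info is from the HTTP headers... like server version
info = response.info().dict
# page is the HTML response
page = response.read()
# chardet guesses the encoding; it returns None when it cannot tell
encoding = chardet.detect(page)['encoding']
if encoding and encoding.lower() != 'utf-8':
    page = page.decode(encoding, 'replace').encode('utf-8')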
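Pulling the hidden fields out of the login page and merging them with your credentials could look roughly like this. It is a sketch for Python 3 that uses the standard library's `html.parser` rather than lxml, so it runs without extra installs; `HiddenInputParser`, `hidden_fields` and `sample_form` are hypothetical names, and the sample form is invented from the field names in the capture, not copied from the real page.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> tags."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        is_hidden = (attributes.get("type") or "").lower() == "hidden"
        if is_hidden and attributes.get("name"):
            # A missing or valueless attribute is treated as an empty string.
            self.fields[attributes["name"]] = attributes.get("value") or ""


def hidden_fields(html):
    """Return a dict of the hidden form fields found in html."""
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.fields


# Invented stand-in for the login page; the real markup will differ.
sample_form = """
<form method="post" action="/Login">
  <input type="hidden" name="authURL" value="sOmELoNgTeXtStRiNg">
  <input type="hidden" name="nextpage" value="">
  <input type="text" name="email">
  <input type="password" name="password">
</form>
"""

# Start from the state the server handed out, then add the credentials.
params = hidden_fields(sample_form)
params.update({"email": "user@example.com", "password": "secret"})
post_body = urlencode(params).encode("ascii")
print(post_body)
```

In a real script you would fetch the login page first, feed that HTML to `hidden_fields`, and POST the resulting body in the same session so that any cookies set along with the hidden values are sent back too.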
BTW, Michael Foord has a very good reference on urllib2 and many of the assorted issues.
So, in summary: you probably need to send authURL in addition to email and password... if possible, I try to mimic what the browser sends...
Upvotes: 4