Saimu

Reputation: 1362

Log in with Python and Requests

I've been trying to access a website that has no API. I want to retrieve my current "queue" from the website, but it won't let me access that part of the site unless I'm logged in. Here is my code:

import requests

login_data = {
    'action': 'https://www.crunchyroll.com/?a=formhandler',
    'name': 'my_username',
    'password': 'my_password'
}

with requests.Session() as s:
    s.post('https://www.crunchyroll.com/login', data=login_data)
    ck = s.cookies
    r = s.get('https://www.crunchyroll.com/home/queue')
    print(r.text)

Right now, I get this page:

<html lang="en">
  <head>
    <title>Redirecting...</title>
    <meta http-equiv="refresh" content="0;url=http://www.crunchyroll.com/home/queue" />
  </head>
  <body>
    <script type="text/javascript">
      document.location.href="http:\/\/www.crunchyroll.com\/home\/queue";
    </script>
  </body>
</html>

I think it should work, but I'm only getting the redirecting page... How am I supposed to get past that?

Thanks!

Upvotes: 0

Views: 630

Answers (1)

snowcrash09

Reputation: 4804

The redirect is happening because you are not logging into the site properly - you have the wrong form URL for the POST request, and you're not POSTing all the form data the site is expecting.

You can figure out what is required to log in by looking at the source code for https://www.crunchyroll.com/login. The parts that matter are the <form> tag and the <input> tags:

<form id="RpcApiUser_Login" method="post" action="https://www.crunchyroll.com/?a=formhandler">
<input type="hidden" name="formname" value="RpcApiUser_Login" />
<input type="text" name="name" value="my_user_name_goes_here" />
<input type="password" name="password" value="my_password_goes_here" />
</form>

What this means is that when you click Submit, the browser sends a POST request to the URL https://www.crunchyroll.com/?a=formhandler, with key/value pairs of data such as formname=RpcApiUser_Login. To replicate this in Python, you need to POST all the same pairs of data to that URL.
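
If you would rather not read the page source by hand, the form's action URL and input names can be pulled out with the standard library. This is only a sketch: FormFieldParser and SAMPLE_HTML are names made up for illustration, and the sample markup assumes the password field is named password, as the Python code in this answer implies. On the real site you would feed it the HTML of the login page instead.

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collect the action URL of the first <form> and every named <input>."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and self.action is None:
            # The URL the browser POSTs to when the form is submitted.
            self.action = attrs.get("action")
        elif tag == "input" and "name" in attrs:
            # Each named input becomes one key/value pair in the POST body.
            self.fields[attrs["name"]] = attrs.get("value", "")


# Hypothetical sample markup modelled on the login form (an assumption,
# not a live copy of the page).
SAMPLE_HTML = """
<form id="RpcApiUser_Login" method="post" action="https://www.crunchyroll.com/?a=formhandler">
<input type="hidden" name="formname" value="RpcApiUser_Login" />
<input type="text" name="name" value="" />
<input type="password" name="password" />
</form>
"""

parser = FormFieldParser()
parser.feed(SAMPLE_HTML)
print(parser.action)
print(parser.fields)
```

The hidden formname field shows up in the result with its value already filled in, which is the kind of pair that is easy to miss when building login_data by hand.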

To learn more about CGI programming like this, look here.

Try this Python code, it works:

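import requests

# Every field the login form submits, including the hidden 'formname' input
login_data = {
    'name': 'my_username',
    'password': 'my_password',
    'formname': 'RpcApiUser_Login'
}

with requests.Session() as s:
    # POST to the form's action URL, not to the /login page itself
    s.post('https://www.crunchyroll.com/?a=formhandler', data=login_data)
    # The session keeps the login cookies for the following request
    r = s.get('http://www.crunchyroll.com/home/queue')
    print(r.text)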
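
To check from code whether you got the real page or the "Redirecting..." stub from the question, one rough heuristic is to look for the meta refresh tag in the response text. This is a sketch under assumptions: looks_like_redirect_stub is a hypothetical helper, and it only recognises stubs shaped like the one quoted in the question.

```python
import re

# Matches a <meta http-equiv="refresh" ...> tag, the marker of the stub page.
META_REFRESH = re.compile(r'<meta[^>]+http-equiv=["\']refresh["\']', re.IGNORECASE)


def looks_like_redirect_stub(html):
    """Return True if the page contains a meta-refresh tag (heuristic only)."""
    return bool(META_REFRESH.search(html))


# The stub page from the question, shortened.
stub = '''<html lang="en">
  <head>
    <title>Redirecting...</title>
    <meta http-equiv="refresh" content="0;url=http://www.crunchyroll.com/home/queue" />
  </head>
</html>'''

print(looks_like_redirect_stub(stub))
print(looks_like_redirect_stub("<html><body>My queue</body></html>"))
```

You could call looks_like_redirect_stub(r.text) after the s.get(...) call and treat a True result as a sign that the login data needs another look.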

Upvotes: 1
