Reputation: 43
I am attempting to log in to a website using Requests and seem to be hitting a wall. Any advice would be appreciated.
I'm attempting to log in to economist.com (no reason, just something I have a username and password for), whose login page is at https://www.economist.com/user/login and whose login form has the attribute action="https://www.economist.com/user/login?destination=%2F".
Using the Chrome developer tools, the form data for a login request is as follows:
name: /////////
pass: ////////
form_build_id: form-483956e97a61f73fbc0ebf06b04dbe3f
form_id: user_login
securelogin_original_baseurl: https://www.economist.com
op: Log in
My code GETs the login page and uses BeautifulSoup to read the form_build_id from the form. It then POSTs to the login URL with my username and password, the retrieved form_build_id, and the other hidden fields. Finally, it uses BeautifulSoup to check whether the homepage banner shows a login or a logout link, to determine whether I have actually logged in.
The code is as follows:
import requests
from bs4 import BeautifulSoup
# Setting user agent to a real browser instead of requests
headers = requests.utils.default_headers()
headers.update(
    {
        'User-Agent': 'Mozilla/5.0',
    }
)
# create a session and login
s = requests.Session()
login_page = s.get('https://www.economist.com/user/login', headers=headers)
login = BeautifulSoup(login_page.text, 'lxml')
form = login.select_one("form > div > input")
payload = {
    'name': '////////////',
    'pass': '////////',
    'form_build_id': form['value'],
    'form_id': 'user_login',
    'securelogin_original_baseurl': 'https://www.economist.com',
    'op': 'Log in'
}
response = s.post("https://www.economist.com/user/login?destination=%2F",
                  data=payload, headers=headers)
# check homepage banner to see if login or logout link is there
url = "https://www.economist.com/"
r = s.get(url, headers=headers)
soup = BeautifulSoup(r.text, 'lxml')
banner = soup.select("div > div > span > a")
for table_row in banner:
    print(table_row['href'])
When run, this code shows that the banner still has the login link instead of the logout link, which, I assume, means that it's not logged in. I know I must have made some very simple mistake in here, but after reading through other similar questions on here, I can't seem to find where I'm going awry. I'd appreciate any advice on making this work.
Upvotes: 4
Views: 1311
Reputation: 144
I tried your code, and only one thing did not work for me: the selector used to find the form_build_id input. I changed
form = login.select_one("form > div > input")
to:
form = login.find('input', attrs={'name': "form_build_id"})
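To illustrate why the lookup by name is more reliable, here is a minimal sketch against a made-up form. The HTML below is a hypothetical simplification, not the real economist.com markup, and it uses the built-in "html.parser" only so the snippet runs without lxml. A positional selector returns whichever input happens to come first, while a lookup by name returns the field you actually need.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified login form; the real page's markup may differ.
SAMPLE_HTML = """
<form id="user-login" action="/user/login?destination=%2F" method="post">
  <div>
    <input type="text" name="name" value="">
    <input type="password" name="pass" value="">
    <input type="hidden" name="form_build_id" value="form-abc123">
    <input type="hidden" name="form_id" value="user_login">
  </div>
</form>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Positional selector: returns the first <input> that is a child of form > div.
first_input = soup.select_one("form > div > input")

# Name-based lookup: returns the input whose name attribute is form_build_id.
build_id_input = soup.find("input", attrs={"name": "form_build_id"})

# Collect every hidden field so none of them has to be hard-coded.
hidden_fields = {
    tag["name"]: tag["value"]
    for tag in soup.find_all("input", attrs={"type": "hidden"})
}

print(first_input["name"])      # name
print(build_id_input["value"])  # form-abc123
print(hidden_fields)
```

In this sample the positional selector picks up the username field, so its value would be sent as form_build_id. Whether that is exactly what happens on the real page depends on its markup.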
After that, the login works normally. To check whether I am logged in, I request a page that only logged-in users can visit: http://www.economist.com/subscriptions/activation
If you can visit this page, you are logged in; otherwise you are redirected to https://www.economist.com/user/register?destination=subscriptions%2Factivation&rp=activating
import requests
from bs4 import BeautifulSoup
# Setting user agent to a real browser instead of requests
headers = requests.utils.default_headers()
headers.update(
    {
        'User-Agent': 'Mozilla/5.0',
    }
)
# create a session and login
s = requests.Session()
login_page = s.get('https://www.economist.com/user/login', headers=headers)
login = BeautifulSoup(login_page.text, 'lxml')
# look the hidden input up by name instead of by its position in the form
form = login.find('input', attrs={'name': "form_build_id"})
payload = {
    'name': '*****',
    'pass': '*****',
    'form_build_id': form['value'],
    'form_id': 'user_login',
    'securelogin_original_baseurl': 'https://www.economist.com',
    'op': 'Log in'
}
response = s.post("https://www.economist.com/user/login?destination=%2F",
                  data=payload, headers=headers)
# request a members-only page and see where we end up after redirects
activation_page = s.get('http://www.economist.com/subscriptions/activation', headers=headers)
if activation_page.url == 'https://www.economist.com/user/register?destination=subscriptions%2Factivation&rp=activating':
    print("Failed to login")
elif activation_page.url == 'http://www.economist.com/subscriptions/activation':
    print("Logged In Successfully!")
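The check above compares full URL strings, so it can miss a match if the site also redirects http to https or changes the query string. A slightly more tolerant sketch is to compare only the path of the final URL. The helper name landed_on is hypothetical, and the URLs are the ones quoted in this answer.

```python
from urllib.parse import urlsplit


def landed_on(final_url, expected_path):
    """Return True if final_url's path equals expected_path.

    Scheme, host, query string and a trailing slash are ignored.
    """
    return urlsplit(final_url).path.rstrip("/") == expected_path.rstrip("/")


# Final URL when the members-only page is served
print(landed_on("https://www.economist.com/subscriptions/activation",
                "/subscriptions/activation"))  # True

# Final URL when the site redirects to the registration page
print(landed_on("https://www.economist.com/user/register"
                "?destination=subscriptions%2Factivation&rp=activating",
                "/subscriptions/activation"))  # False
```

With the session from the code above, this would be called as landed_on(activation_page.url, "/subscriptions/activation"). Inspecting activation_page.history, which lists any redirect responses Requests followed, can also help when debugging.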
Upvotes: 1