makaramkd

Reputation: 46

Getting 302 with Curl, but 200 with python requests

The site I am trying to scrape responds with a 302 redirect, as seen in the browser's network tools. I copied the network request as cURL and it works fine, but when I convert it to Python requests it just returns 200.

Curl:

curl -v 'https://my-site.com/CardAuthentication.aspx' \
  -H 'Connection: keep-alive' \
  -H 'Pragma: no-cache' \
  -H 'Cache-Control: no-cache' \
  -H 'Upgrade-Insecure-Requests: 1' \
  -H 'Origin: https://my-site.com' \
  -H 'Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryBN2XwBbcAwvRWZzk' \
  -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36' \
  -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9' \
  -H 'Sec-GPC: 1' \
  -H 'Sec-Fetch-Site: same-origin' \
  -H 'Sec-Fetch-Mode: navigate' \
  -H 'Sec-Fetch-User: ?1' \
  -H 'Sec-Fetch-Dest: document' \
  -H 'Referer: https://my-site.com/CardAuthentication.aspx' \
  -H 'Accept-Language: en-US,en;q=0.9' \
  -H 'Cookie: ASP.NET_SessionId=...; TS0192fa71=...; ServiceProvider=UID=...; .ASPXAUTH=...' \
  --data-raw $'------WebKitFormBoundaryBN2XwBbcAwvRWZzk\r\ DATA \r\n' \
  --compressed

This returns:

*   Trying IP:443...
........
........
< HTTP/1.1 302 Found
< Cache-Control: no-cache
< Pragma: no-cache
< Content-Type: text/html; charset=utf-8
< Expires: -1
< Location: my-site.com/MemberDetails.aspx
< Date: Mon, 07 Feb 2022 09:54:49 GMT
< Content-Length: 61211
< Set-Cookie: TS0192fa71=...; Path=/; Domain=.my-site.com; Secure

Python:

import requests
import logging


logging.basicConfig(level=logging.DEBUG)
cookies = {
    "ASP.NET_SessionId": ".....",
    "TS0192fa71": ".....",
    "ServiceProvider": ".....",
    ".ASPXAUTH": ".....",
}

headers = {
    ......
    "Content-Type": "multipart/form-data; boundary=----WebKitFormBoundaryBN2XwBbcAwvRWZzk",
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36",
    "Referer": "https://my-site.com/CardAuthentication.aspx",
    "Accept-Language": "en-US,en;q=0.9",
}

data = {
    "------WebKitFormBoundaryBN2XwBbcAwvRWZzk\r\nContent-Disposition: form-data; name": '"__EVENTTARGET"\r\n\r\n\r\n-- DATA
}

response = requests.post(
    "https://my-site.com/CardAuthentication.aspx",
    headers=headers,
    cookies=cookies,
    data=data,
)

The return code is 200, and the response history is empty.

Is there something wrong with the requests library, or with the way it processes the data? How can I solve this?

Upvotes: 0

Views: 900

Answers (1)

Daweo

Reputation: 36370

Is there something wrong with the requests library, or with the way it processes the data? How can I solve this?

This is the default requests behavior: redirects are followed automatically. Set allow_redirects to False if you do not want requests to follow a 3xx response. For example:

import requests
r = requests.get("http://github.com", allow_redirects=False)
print(r.status_code)  # 301
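
To see the difference without touching a real site, here is a minimal sketch that runs a throwaway local server and sends the same POST twice. The handler class, the /login and /details paths, and the form field are all hypothetical stand-ins, not anything taken from the site in the question.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class RedirectingHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a login page: POST answers 302, GET answers 200."""

    def do_POST(self):
        # Drain the request body, then redirect the client.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        self.send_response(302)
        self.send_header("Location", "/details")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def compare_redirect_handling():
    """Return (followed, not_followed) responses for the same POST."""
    server = HTTPServer(("127.0.0.1", 0), RedirectingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_port}/login"
    try:
        with requests.Session() as session:
            session.trust_env = False  # ignore proxy settings for the local server
            followed = session.post(url, data={"field": "value"})
            not_followed = session.post(
                url, data={"field": "value"}, allow_redirects=False
            )
    finally:
        server.shutdown()
        server.server_close()
    return followed, not_followed


if __name__ == "__main__":
    followed, not_followed = compare_redirect_handling()
    print(followed.status_code, [r.status_code for r in followed.history])
    print(not_followed.status_code, not_followed.headers["Location"])
```

When a redirect is followed, the final response carries the intermediate 302 in r.history. With allow_redirects=False the 302 itself is returned and r.history is empty. So a 200 with an empty history suggests the server never sent a redirect for that request in the first place.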

To learn more, read the requests.request docs.
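
Separately from redirects, it may be worth checking how the body is encoded. This is a guess at a possible cause, not something confirmed for this site. When a dict is passed as data=, requests form-urlencodes it, so it no longer matches a hand-written multipart Content-Type header. The sketch below builds the requests without sending them so the result can be inspected. The "__EVENTTARGET" field name is from the question, while its empty value and the placeholder boundary "X" are assumptions for illustration.

```python
import requests

URL = "https://my-site.com/CardAuthentication.aspx"

# A dict passed as data= is urlencoded, even if the header claims multipart.
mismatched = requests.Request(
    "POST",
    URL,
    headers={"Content-Type": "multipart/form-data; boundary=X"},  # placeholder boundary
    data={"__EVENTTARGET": ""},
).prepare()

# Each files= value is a (filename, content) tuple; filename=None makes it a
# plain form field. No Content-Type header is set by hand, so requests
# generates the boundary and the matching header itself.
multipart = requests.Request(
    "POST",
    URL,
    files={"__EVENTTARGET": (None, "")},
).prepare()

if __name__ == "__main__":
    print(mismatched.headers["Content-Type"])
    print(mismatched.body)
    print(multipart.headers["Content-Type"])
    print(multipart.body.decode())
```

Nothing is sent over the network here. If the multipart form turns out to be what the server expects, the same files= argument can be passed to requests.post together with the cookies, leaving the Content-Type header out of headers.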

Upvotes: 2
