JMFisac

Reputation: 41

Getting request information instead of HTML when using httpx with proxies in Python

I'm making an HTTP request with the httpx library in Python through a list of proxies. I expect the response body to be the HTML of the webpage, but instead I get what looks like a dump of information about the request itself. Here's my code:

import httpx

def httpx_get_html_from_url_proxies(url, proxies, headers_list):
    status_code = None
    html = None
    for proxy in proxies:
        # Use a separate name for the per-proxy mapping so the `proxies` argument is not overwritten
        proxy_map = {"http://": f'http://{proxy}', "https://": f'http://{proxy}'}
        try:
            req = httpx.get(url,
                            proxies=proxy_map,
                            verify=False,
                            follow_redirects=True,
                            headers=get_random_header(headers_list))
            status_code = req.status_code
            if req.status_code == 200:
                html = req.text
            else:
                print(f'httpx proxies code: {req.status_code}')
                log_error(f'httpx proxies code: {req.status_code}')
        except Exception as e:
            print(f'httpx proxies error: {e}')
            log_error(f'httpx proxies error: {e}')
        if html:
            break
    return html, status_code

Instead of the webpage's HTML, the response body contains fields describing the request itself, such as REMOTE_ADDR and REQUEST_METHOD:

REMOTE_ADDR = ......
REMOTE_PORT = .....
REQUEST_METHOD = GET
REQUEST_URI = ......
REQUEST_TIME_FLOAT = 1715805896.0432796
REQUEST_TIME = 1715805896
HTTP_HOST = ........
HTTP_CONNECTION = keep-alive
HTTP_UPGRADE-INSECURE-REQUESTS = 1
HTTP_USER-AGENT = .......
HTTP_ACCEPT = text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3
HTTP_SEC-CH-UA = Google Chrome;v="86", "Chromium";v="86", ";Not A Brand";v="99"
HTTP_SEC-CH-UA-MOBILE = ?0
HTTP_SEC-CH-UA-PLATFORM = Windows
HTTP_SEC-FETCH-SITE = none
HTTP_SEC-FETCH-MOD = 
HTTP_SEC-FETCH-USER = ?1
HTTP_ACCEPT-ENCODING = gzip, deflate
HTTP_ACCEPT-LANGUAGE = en-US,fr;q=0.8
HTTP_REFERER = .........
HTTP_SEC-FETCH-DEST = document
HTTP_SEC-FETCH-MODE = navigate
HTTP_SEC_FETCH_SITE = cross-site

I have used ellipses for privacy reasons.
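
For debugging, a check along these lines might help tell a body like the one above apart from real HTML. This is only a rough sketch: `looks_like_request_dump`, `DUMP_KEYS`, and the `min_keys` threshold are names and values made up for illustration (they are not part of httpx), and the marker keys are simply taken from the output shown above.

```python
import re

# Hypothetical marker keys, copied from the dump shown in the question.
DUMP_KEYS = ("REMOTE_ADDR", "REQUEST_METHOD", "REQUEST_URI", "HTTP_HOST")

# Matches the key of a line shaped like "KEY = value" at the start of a line.
KEY_VALUE_LINE = re.compile(r"^([A-Z][A-Z0-9_-]*) = ", re.MULTILINE)


def looks_like_request_dump(body, min_keys=3):
    """Return True if `body` reads like a KEY = value request dump rather than HTML."""
    found = {match.group(1) for match in KEY_VALUE_LINE.finditer(body)}
    hits = sum(1 for key in DUMP_KEYS if key in found)
    return hits >= min_keys
```

In the loop above, the condition could then become something like `if req.status_code == 200 and not looks_like_request_dump(req.text):`, so that a proxy returning this kind of body is skipped and the next one is tried. It does not explain why the dump is returned, it only detects it.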

Can someone help me understand why I'm receiving this response and how I can fix it to properly retrieve the HTML of the webpage?

Thank you in advance for any assistance you can provide!

Upvotes: 0

Views: 66

Answers (0)
