John Heyer
Reputation: 911

Python http.client using HTTP Proxy server going to non-HTTPS site

I need to retrieve API data from an environment where all outbound traffic must go through an HTTP/HTTPS proxy server. Connecting to an HTTPS site through the proxy with tunneling (CONNECT) works fine. Here's example code:

import http.client

PROXY = {'host': "10.20.30.40", 'port': 3128}
TARGET = {'host': "example.com", 'port': 443, 'url': "/api/v1/data"}
HEADERS = {'Host': TARGET.get('host'), 'User-agent': "Python http.client"}
TIMEOUT = 10

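# Tunnel (CONNECT) only for HTTPS targets. The original walrus bound a bool to `port`, not the port number.
if TARGET.get('port') == 443: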
    conn = http.client.HTTPSConnection(PROXY.get('host'), port=PROXY.get('port'), timeout=TIMEOUT)
    conn.set_tunnel(host=TARGET.get('host'), port=TARGET.get('port', 443))
else:
    conn = http.client.HTTPConnection(PROXY.get('host'), port=PROXY.get('port'), timeout=TIMEOUT)
conn.request(method="GET", url=TARGET.get('url', "/"), headers=HEADERS)
response = conn.getresponse()
conn.close()
print(response.status, response.reason)

I also want to support plain HTTP URLs, and have tried this:

TARGET = {'host': "example.com", 'port': 80, 'url': "/api/v1/data"}

The proxy replies with a 400 Bad Request error. Here's the proxy log entry:

192.168.1.100 NONE_NONE/400 3885 GET /api/v1/data - HIER_NONE/- text/html

A test curl to the same URL shows up in the proxy as this:

192.168.1.100 TCP_MISS/200 932 GET http://example.com/api/v1/data - HIER_DIRECT/203.0.113.132 application/json

This makes some sense: curl's request line contains the full URL, while mine contains only the path. With tunneling, the web server's host is configured in set_tunnel(), but plain HTTP has no such step. I assumed the HTTP "Host" header would tell the proxy where to send the request, but I could be mistaken. What am I missing?
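The difference between the two log lines suggests one thing to try. A forward proxy handling plain HTTP generally expects the request line to carry the absolute URL (as in the curl entry), not just the path. The following is a minimal sketch under that assumption, not a confirmed fix for this particular proxy. The helper names `absolute_target` and `get_via_proxy` are hypothetical and not part of http.client.

```python
import http.client


def absolute_target(host: str, port: int, path: str) -> str:
    """Build an absolute-form request target (http://host[:port]/path)."""
    # Leave out the port when it is the HTTP default, as in the curl log line.
    netloc = host if port == 80 else f"{host}:{port}"
    if not path.startswith("/"):
        path = "/" + path
    return f"http://{netloc}{path}"


def get_via_proxy(proxy_host, proxy_port, host, port, path, timeout=10):
    """Send a plain-HTTP GET through a forward proxy (hypothetical helper)."""
    # Connect to the proxy itself; set_tunnel() is not used for plain HTTP.
    conn = http.client.HTTPConnection(proxy_host, port=proxy_port, timeout=timeout)
    try:
        headers = {
            # Host names the origin server, including any non-default port.
            "Host": host if port == 80 else f"{host}:{port}",
            "User-agent": "Python http.client",
        }
        # The full URL goes in the request line instead of only the path.
        conn.request("GET", absolute_target(host, port, path), headers=headers)
        response = conn.getresponse()
        return response.status, response.reason
    finally:
        conn.close()
```

With the values from the question, this would be called as `get_via_proxy("10.20.30.40", 3128, "example.com", 80, "/api/v1/data")`. The HTTPS branch with set_tunnel() stays as it is.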
