Reputation: 6259
I am facing the following scenario: I am forced to use an HTTP proxy to connect to an HTTPS server. For several reasons I need access to the raw data (before encryption), so I am using the socket library instead of one of the HTTP-specific libraries. I thus first connect a TCP socket to the HTTP proxy and issue the CONNECT command.
At this point, the HTTP proxy accepts the connection and seemingly forwards all further data to the target server. However, if I now try to switch to SSL, I receive
error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol
indicating that the socket attempted the handshake with the HTTP proxy and not with the HTTPS target.
Here's the code I have so far:
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('proxy',9502))
s.send("""CONNECT en.wikipedia.org:443 HTTP/1.1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:15.0) Gecko/20100101 Firefox/15.0.1
Proxy-Connection: keep-alive
Host: en.wikipedia.org
""")
print s.recv(1000)
ssl = socket.ssl(s, None, None)
ssl.connect(("en.wikipedia.org",443))
What would be the correct way to open an SSL socket to the target server after connecting to the HTTP proxy?
Upvotes: 1
Views: 6302
Reputation: 252
Works with Python 3.
<proxy> is an IP address or domain name.
<port> is 443, 80, or whatever port your proxy is listening on.
<endpoint> is the final server you want to connect to via the proxy.
<cn> is an optional SNI host name that your final server may be expecting.
import socket, ssl

# replace the <proxy> and <port> placeholders with your proxy's address before running
def getcert_sni_proxy(cn, endpoint, PROXY_ADDR=("<proxy>", <port>)):
    # prepare the CONNECT request
    CONNECT = "CONNECT %s:%s HTTP/1.0\r\nConnection: close\r\n\r\n" % (endpoint, 443)
    # connect to the actual proxy
    conn = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    conn.connect(PROXY_ADDR)
    conn.send(str.encode(CONNECT))
    conn.recv(4096)
    # create the context for the SSL layer (PROTOCOL_SSLv23 negotiates the protocol version)
    context = ssl.SSLContext(ssl.PROTOCOL_SSLv23)
    # start TLS with the final endpoint through the proxy, sending optional server name information (cn here)
    sock = context.wrap_socket(conn, server_hostname=cn)
    # retrieve the certificate from the server
    certificate = ssl.DER_cert_to_PEM_cert(sock.getpeercert(True))
    return certificate
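The function above discards the proxy's reply (conn.recv(4096)) without looking at it. A minimal sketch of reading the reply up to the blank line and returning the status code, so that the caller can confirm the tunnel was accepted before wrapping the socket, might look like this. read_proxy_reply is a hypothetical helper name, not part of the answer or of any library.

```python
import socket


def read_proxy_reply(conn, limit=65536):
    """Read the proxy's reply headers and return the HTTP status code.

    Hypothetical helper: reads one byte at a time so that nothing
    after the header block is consumed by accident.
    """
    data = b""
    while not data.endswith(b"\r\n\r\n"):
        chunk = conn.recv(1)
        if not chunk:
            raise ConnectionError("proxy closed the connection early")
        data += chunk
        if len(data) > limit:
            raise ValueError("proxy reply headers too large")
    status_line = data.split(b"\r\n", 1)[0].decode("latin-1")
    # The status line looks like: HTTP/1.1 200 Connection established
    parts = status_line.split(" ", 2)
    if len(parts) < 2 or not parts[1].isdigit():
        raise ValueError("malformed status line: %r" % status_line)
    return int(parts[1])
```

A caller would typically refuse to continue unless the returned code is in the 2xx range.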
Upvotes: 0
Reputation: 122719
(Note that, in general, it would be easier to use an existing HTTPS library such as PyCurl instead of implementing it all yourself.)
Firstly, don't call your variable ssl. This name is already used by the ssl module, so you don't want to hide it.
Secondly, don't use connect a second time. You're already connected; what you need is to wrap the socket. Since Python doesn't do any certificate verification by default, you'll need to verify the remote certificate and verify the host name too.
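On Python 3, one way to get both checks is to build the context with ssl.create_default_context(), which enables certificate verification and host name checking by default. The sketch below assumes Python 3; make_verifying_context and wrap_tunnel are hypothetical names chosen for illustration. Note that the module-level ssl.wrap_socket function was deprecated in Python 3.7 and removed in 3.12, so SSLContext.wrap_socket is used here instead.

```python
import ssl


def make_verifying_context(cafile=None):
    """Return an SSLContext that verifies the certificate chain and host name.

    cafile is an optional path to a CA bundle; when it is None the
    system's default trust store is used.
    """
    context = ssl.create_default_context(cafile=cafile)
    # Both settings are already the defaults for create_default_context();
    # they are set explicitly here only to make the intent visible.
    context.check_hostname = True
    context.verify_mode = ssl.CERT_REQUIRED
    return context


def wrap_tunnel(conn, hostname, context=None):
    """Start TLS on an already-connected socket (e.g. a CONNECT tunnel)."""
    context = context or make_verifying_context()
    # server_hostname is used for SNI and for the host name check
    return context.wrap_socket(conn, server_hostname=hostname)
```

With this context, the handshake itself fails if the certificate is not trusted or does not match the host name, so no separate check is needed afterwards.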
Here are the steps involved:
1. Send CONNECT to the proxy, like you're doing in the first few lines.
2. Use ssl_s = ssl.wrap_socket(s, cert_reqs=ssl.CERT_REQUIRED, ssl_version=ssl.PROTOCOL_TLSv1, ca_certs='/path/to/cabundle.pem') to wrap the socket. Then, verify the host name. It's worth reading this answer about the connect method and what it does after wrapping the socket.
3. Use ssl_s as if it was your normal socket. Don't call connect again.
Upvotes: 1
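The steps in the answer above can be put together roughly as follows. This is a sketch for Python 3 under stated assumptions, not the answerer's code: build_connect_request and open_tls_via_proxy are hypothetical names, the proxy is assumed to need no authentication, and ssl.create_default_context() is used in place of ssl.wrap_socket so that the certificate and host name are verified during the handshake.

```python
import socket
import ssl


def build_connect_request(host, port):
    """Return the bytes of a CONNECT request for host:port."""
    target = "%s:%d" % (host, port)
    request = "CONNECT %s HTTP/1.1\r\nHost: %s\r\n\r\n" % (target, target)
    return request.encode("ascii")


def open_tls_via_proxy(proxy_addr, host, port=443, timeout=10):
    """Tunnel through an HTTP proxy, then start TLS with the target host."""
    # Step 1: plain TCP connection to the proxy, then CONNECT
    s = socket.create_connection(proxy_addr, timeout=timeout)
    try:
        s.sendall(build_connect_request(host, port))
        reply = b""
        while b"\r\n\r\n" not in reply:
            chunk = s.recv(4096)
            if not chunk:
                raise ConnectionError("proxy closed the connection")
            reply += chunk
        status = reply.split(b" ", 2)[1]
        if not status.startswith(b"2"):
            first_line = reply.split(b"\r\n", 1)[0]
            raise ConnectionError("proxy refused CONNECT: %r" % first_line)
        # Step 2: wrap the existing socket; no second connect() call
        context = ssl.create_default_context()
        ssl_s = context.wrap_socket(s, server_hostname=host)
    except Exception:
        s.close()
        raise
    # Step 3: the caller uses ssl_s like a normal socket
    return ssl_s
```

Usage would be along the lines of ssl_s = open_tls_via_proxy(("proxy", 9502), "en.wikipedia.org"), followed by ssl_s.sendall(...) and ssl_s.recv(...) with the raw HTTP request and response.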