Reputation: 52
I'm creating an HTTP proxy in Python, but I'm having trouble: my proxy only accepts the web server's response and completely ignores the browser's next request, so the transfer of data just stops. Here's the code:
import socket

s = socket.socket()
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
bhost = '192.168.1.115'
port = 8080
s.bind((bhost, port))
s.listen(5)

def server(sock, data, host):
    p = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    p.connect((host, 80))
    p.send(data)
    rdata = p.recv(1024)
    print(rdata)
    sock.send(rdata)

while True:
    sock, addr = s.accept()
    data = sock.recv(1024)
    host = data.splitlines()[1][6:]
    server(sock, data, host)
Sorry about the code; this is just a trial version. Any help will be much appreciated, as I am only 14 and have much to learn :-)
Upvotes: 1
Views: 1035
Reputation: 3215
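Unfortunately, I don't really see how your code should work, so I'm putting down my thoughts on what a simple HTTP proxy should look like. A basic proxy server should accept a client connection, receive the request and work out which host it is meant for, forward requests and responses between the client and that host, and cope with the Connection: keep-alive header. Let's go step by step and write some very simplified code.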
How does a proxy accept a client? A socket should be created and moved to passive mode:
import socket, select

sock = socket.socket()
sock.bind((your_ip, port))  # your_ip and port are placeholders
sock.listen()
while True:
    # accept() returns a (socket, address) pair
    client_sock, client_addr = sock.accept()
    do_stuff(client_sock)
Once the TCP connection is established, it's time to receive a request. Let's assume we're going to get something like this:
GET /?a=1&b=2 HTTP/1.1
Host: localhost
User-Agent: my browser details
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-gb,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
In TCP, message boundaries aren't preserved, so we should wait until we have at least the first two lines (for a GET request) in order to know what to do next:
def do_stuff(sock):
    data = receive_two_lines(sock)
    remote_host = parse_request(data)
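For what it's worth, here is a rough sketch of what those two helpers might look like. The name receive_headers is my own stand-in for receive_two_lines: rather than counting lines, it keeps reading until the blank line that ends the headers. The 64 KiB limit is an arbitrary assumption, and the Host value is returned as-is, so a "host:port" value would still need splitting. Treat it as an illustration rather than a finished implementation.

```python
import socket


def receive_headers(sock, limit=65536):
    """Read from sock until the blank line that ends the HTTP headers."""
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = sock.recv(1024)
        if not chunk:            # peer closed before sending full headers
            break
        data += chunk
        if len(data) > limit:    # assumed cap to avoid unbounded buffering
            break
    return data


def parse_request(data):
    """Return the Host header value as a str, or None if it is absent."""
    for line in data.split(b"\r\n")[1:]:   # skip the request line
        if not line:                        # blank line: end of headers
            break
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"host":
            return value.strip().decode("ascii")
    return None
```

Looking the header up by name avoids relying on Host always being the second line, which browsers usually do but are not required to.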
After we have got the remote hostname, it's time to forward the requests and responses:
def do_stuff(client_sock):
    data = receive_two_lines(client_sock)
    remote_host = parse_request(data)
    # getaddrinfo returns a list of (family, type, proto, canonname, sockaddr)
    info = socket.getaddrinfo(remote_host, 80, socket.AF_INET, socket.SOCK_STREAM)
    remote_ip = info[0][4][0]
    webserver = socket.socket()
    webserver.connect((remote_ip, 80))
    webserver.sendall(data)
    while it_makes_sense():
        # Wait on both sockets at once so that neither side blocks the other
        ready = select.select([client_sock, webserver], [], [])[0]
        if client_sock in ready:
            webserver.sendall(client_sock.recv(1024))
        if webserver in ready:
            client_sock.sendall(webserver.recv(1024))
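The loop above leaves it_makes_sense() undefined. One possible way to fill it in, as a sketch only, is to stop when recv returns an empty bytes object (the peer closed its side) or when nothing has arrived for a while. The function name relay, the 60-second idle timeout, and the 4096-byte read size are my own assumptions.

```python
import select
import socket


def relay(client_sock, webserver, timeout=60):
    """Copy bytes both ways until either peer closes or the link goes idle."""
    sockets = [client_sock, webserver]
    while True:
        readable, _, _ = select.select(sockets, [], [], timeout)
        if not readable:         # nothing arrived within `timeout` seconds
            return
        for src in readable:
            dst = webserver if src is client_sock else client_sock
            chunk = src.recv(4096)
            if not chunk:        # empty read: src closed its side
                return
            dst.sendall(chunk)
```

The caller would still need to close both sockets afterwards and wrap the call in try/except for socket errors.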
Please note select - this is how we know whether a remote peer has sent us data. I haven't run or tested this code, and there are things left to do:
- A single client_sock.recv(1024) call may not return exactly one request because, again, message boundaries aren't preserved in TCP. You should probably look for additional GET requests each time you receive data.
- Browsers and web servers may keep a connection open by setting the Connection: keep-alive option in the headers, but they may also decide to drop it. Be ready to detect disconnects and sockets closed by a remote peer (for simplicity's sake, this is called while it_makes_sense() in the code).
- bind, listen, accept, recv, send, sendall, getaddrinfo, select - all these functions can throw exceptions. It's better to catch them and act accordingly.
Upvotes: 1