Reputation: 43
I'm just starting out with web data in Python (3.6.1). I was learning about sockets and ran into a problem with my code that I couldn't figure out. The website used in my code works fine, but when I run this code I get a 400 Bad Request error. I'm not sure what is wrong with my code. Thanks in advance.
import socket
mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysock.connect(('data.pr4e.org', 80))
mysock.send(('GET http://data.pr4e.org/romeo.txt HTTP/1.0 \n\n').encode())
while True:
    data = mysock.recv(512)
    if ( len(data) < 1 ):
        break
    print (data)
mysock.close()
Upvotes: 4
Views: 4831
Reputation: 1
This code worked for me:
GET http://data.pr4e.org/romeo.txt HTTP/1.0\r\n\r\n
Change \n\n to \r\n\r\n, and remove the space between HTTP/1.0 and \r\n\r\n.
Upvotes: 0
Reputation: 31
'GET http://data.pr4e.org/romeo.txt HTTP/1.0\r\n\r\n'.encode()
works for me.
Upvotes: 3
Reputation: 123375
GET http://data.pr4e.org/romeo.txt HTTP/1.0 \n\n
Welcome to the wonderful world of HTTP, which most users think is an easy protocol because it is human-readable, but which can in fact be very complex. Your request above has several problems:
- The request target should be only the path, /romeo.txt, not a full URL. Full URLs are used only when sending a request to a proxy.
- The line ending must be \r\n, not \n.
- There should be no space after HTTP/1.0 before the end of the line.
With this in mind, the data you send should instead be:
GET /romeo.txt HTTP/1.0\r\nHost: data.pr4e.org\r\n\r\n
And I've tested that it works perfectly with this modification.
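To show how the corrected request fits into the original program, here is a minimal sketch. The helper names build_request and fetch are hypothetical and chosen only for illustration; the 10-second timeout is likewise an assumption, not something from the question.

```python
import socket


def build_request(host, path):
    """Return the raw bytes of a minimal HTTP/1.0 GET request."""
    lines = [
        "GET {} HTTP/1.0".format(path),  # path only, no scheme or hostname
        "Host: {}".format(host),         # lets the server pick the right site
        "",                              # the two empty strings produce the
        "",                              # blank line that ends the headers
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(host, path, port=80):
    """Send the request and return the complete raw response as bytes."""
    # The timeout value is an arbitrary choice for this sketch.
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_request(host, path))  # sendall handles partial sends
        chunks = []
        while True:
            data = sock.recv(512)
            if not data:  # empty bytes: the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling print(fetch('data.pr4e.org', '/romeo.txt').decode()) should then print the status line, the response headers, and the text itself, assuming the server is reachable.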
But given that HTTP is not as simple as it might look, I really recommend using a library like requests to access the target. If this looks like too much overhead to you, please study the HTTP standard and implement it properly instead of guessing how HTTP works from a few examples, and guessing wrong.
Note also that servers differ in how forgiving they are of broken implementations like yours. What once worked with one server might not work with the next server, or even with the same server after a software upgrade. Using a robust, well-tested, and well-maintained library instead of doing everything on your own might therefore save you a lot of trouble later.
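As a rough illustration of the library route, here is a sketch that uses urllib.request from the standard library rather than requests, so that nothing needs to be installed; requests offers a similar one-call interface. The function name fetch_text, the timeout, and the UTF-8 fallback are assumptions made for this example.

```python
import urllib.request


def fetch_text(url, timeout=10):
    """Fetch a URL and return the response body decoded to a string."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        # Use the charset declared by the server, falling back to UTF-8
        # (the fallback is an assumption made for this sketch).
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset)
```

With this, fetch_text('http://data.pr4e.org/romeo.txt') takes care of the request line, the Host header, and the line endings. Error statuses such as 400 are raised as urllib.error.HTTPError rather than returned.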
Upvotes: 7
Reputation:
You don't send the protocol and hostname to the web server in the request line. The hostname goes separately in a Host header, which is mandatory only in HTTP 1.1.
For HTTP 1.0, it should be:
mysock.send(b'GET /romeo.txt HTTP/1.0\r\n\r\n')
Alternatively, you could try sending an HTTP 1.1 request:
mysock.send(b'GET /romeo.txt HTTP/1.1\r\n')
mysock.send(b'Host: data.pr4e.org\r\n\r\n')
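One caveat with HTTP 1.1: connections are persistent by default, so a loop that reads until the server closes the socket may wait a long time. A Connection: close header asks the server to close after responding. The sketch below builds such a request and splits a raw response into its parts; build_http11_request and split_response are hypothetical names, and the splitter assumes the body is not sent with chunked transfer encoding.

```python
def build_http11_request(host, path):
    """Return a minimal HTTP/1.1 GET request that asks the server to close."""
    return (
        "GET {} HTTP/1.1\r\n"
        "Host: {}\r\n"           # mandatory in HTTP/1.1
        "Connection: close\r\n"  # server closes the socket after the response
        "\r\n"                   # blank line ends the header block
    ).format(path, host).encode("ascii")


def split_response(raw):
    """Split raw response bytes into (status_line, header_lines, body)."""
    # Headers and body are separated by the first blank line.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    return status_line, header_lines, body
```

The bytes from build_http11_request('data.pr4e.org', '/romeo.txt') can be passed to mysock.sendall(), and the data collected from the recv loop can be handed to split_response to inspect the status line before using the body.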
Upvotes: 1