Cornstalks

Reputation: 38238

What can cause Rust's TcpStream::write() to return "invalid input"?

For a little fun I wanted to make a simple HTTP request in Rust. I threw this together and it works great:

use std::io::TcpStream;

fn main() {
    // This just does a "GET /" to www.stroustrup.com
    println!("Establishing connection...");
    let mut stream = TcpStream::connect("www.stroustrup.com:80").unwrap();

    println!("Writing HTTP request...");
    // unwrap() the result to make sure it succeeded, at least
    let _ = stream.write(b"GET / HTTP/1.1\r\n\
                           Host: www.stroustrup.com\r\n\
                           Accept: */*\r\n\
                           Connection: close\r\n\r\n").unwrap();

    println!("Reading response...");
    let response = stream.read_to_string().unwrap();

    println!("Printing response:");
    println!("{}", response);
}

The output is:

Establishing connection...
Writing HTTP request...
Reading response...
Printing response:
HTTP/1.1 200 OK
...and the rest of the long HTTP response with all the HTML as I'd expect...

However, if I change the request to be /C++.html instead of /:

use std::io::TcpStream;

fn main() {
    // The only change is to "GET /C++.html" instead of "GET /"
    println!("Establishing connection...");
    let mut stream = TcpStream::connect("www.stroustrup.com:80").unwrap();

    println!("Writing HTTP request...");
    // unwrap() the result to make sure it succeeded, at least
    let _ = stream.write(b"GET /C++.html HTTP/1.1\r\n\
                           Host: www.stroustrup.com\r\n\
                           Accept: */*\r\n\
                           Connection: close\r\n\r\n").unwrap();

    println!("Reading response...");
    let response = stream.read_to_string().unwrap();

    println!("Printing response:");
    println!("{}", response);
}

The socket returns "invalid input":

Establishing connection...
Writing HTTP request...
Reading response...
thread '<main>' panicked at 'called `Result::unwrap()` on an `Err` value: invalid input', /Users/rustbuild/src/rust-buildbot/slave/nightly-dist-rustc-mac/build/src/libcore/result.rs:746

Why does the socket return "invalid input"? The TCP socket isn't aware of the HTTP protocol (and I've tested my request with telnet and netcat: it's correct), so it can't be complaining about HTTP request/response.

What does "invalid input" even mean here? Why doesn't this work?

My rust version (I'm on OS X 10.10.1):

$ rustc --version
rustc 1.0.0-nightly (ea6f65c5f 2015-01-06 19:47:08 +0000)

Upvotes: 8

Views: 2378

Answers (2)

Assembler

Reputation: 11

The offending bytes are 0x96, which is indeed invalid UTF-8. The intended character is U+2013 (–, an en dash). The document is encoded as either ISO-8859-1 or Windows-1252. The HTML has a number of other problems too, such as unescaped & characters.
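To illustrate the byte-to-character mapping, here is a minimal sketch in current Rust. The helper name `decode_latin1_with_en_dash` is hypothetical, and it is not a full Windows-1252 decoder: it treats every byte as ISO-8859-1 (each byte maps to the code point with the same value) and special-cases only 0x96, which Windows-1252 assigns to U+2013.

```rust
/// Hypothetical helper: decodes bytes as ISO-8859-1, except that 0x96 is
/// mapped to U+2013 (en dash) as Windows-1252 defines it.
/// Other Windows-1252-specific bytes (0x80..=0x9F) are NOT handled here.
fn decode_latin1_with_en_dash(bytes: &[u8]) -> String {
    bytes
        .iter()
        .map(|&b| match b {
            0x96 => '\u{2013}', // en dash in Windows-1252
            _ => b as char,     // ISO-8859-1: byte value equals the code point
        })
        .collect()
}

fn main() {
    // Stand-in for the problematic part of the page, with raw 0x96 bytes.
    let raw: &[u8] = b"What \x96 if anything \x96 have we learned from C++?";
    println!("{}", decode_latin1_with_en_dash(raw));
}
```

For real pages it is probably better to use a proper encoding library and the charset the server declares; the sketch only shows why 0x96 turns into a dash once the right encoding is assumed.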

Upvotes: 1

Cornstalks

Reputation: 38238

The "invalid input" error isn't coming from the socket. It's coming from String: read_to_string() has to decode the bytes it reads as UTF-8, and that decoding fails. If the read_to_string() call is changed to read_to_end(), which returns the raw bytes, the read succeeds. So the response isn't valid UTF-8.

More explicitly, the code:

println!("Reading response...");
let response = stream.read_to_end().unwrap();

println!("Printing response:");
println!("{}", String::from_utf8(response));

returns:

Err(invalid utf-8: invalid byte at index 14787)

So the response body contains bytes that are not valid UTF-8. Looking at the web page, the error is here (the � characters mark the invalid bytes):

Lang.Next'14 Keynote: What � if anything � have we learned from C++?
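The same behavior can be reproduced without a network connection. The following is a sketch in current Rust (the API differs from the 2015 nightly used in the question); the in-memory `body` is a made-up stand-in for the response, and `read_lossy` is a hypothetical helper name. It shows that the strict read fails while reading raw bytes and decoding them lossily works.

```rust
use std::io::{self, Cursor, Read};

/// Hypothetical helper: reads everything from `reader` as raw bytes, then
/// decodes it as UTF-8, replacing each invalid sequence with U+FFFD.
fn read_lossy<R: Read>(mut reader: R) -> io::Result<String> {
    let mut bytes = Vec::new();
    reader.read_to_end(&mut bytes)?;
    Ok(String::from_utf8_lossy(&bytes).into_owned())
}

fn main() {
    // Stand-in for the response body; a lone 0x96 byte is not valid UTF-8.
    let body: &[u8] = b"What \x96 if anything \x96 have we learned from C++?";

    // Strict decoding fails, as in the question.
    let mut strict = String::new();
    let err = Cursor::new(body).read_to_string(&mut strict).unwrap_err();
    println!("read_to_string failed: {:?}", err.kind());

    // String::from_utf8 reports how many leading bytes were valid.
    let utf8_err = String::from_utf8(body.to_vec()).unwrap_err();
    println!("valid up to byte {}", utf8_err.utf8_error().valid_up_to());

    // Lossy decoding always succeeds, at the cost of replacement characters.
    println!("{}", read_lossy(Cursor::new(body)).unwrap());
}
```

Lossy decoding is only a workaround for displaying the text; if the exact characters matter, the bytes need to be decoded with the encoding the page actually uses.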

Upvotes: 10
