Reputation: 51
I have a response parser, boost::beast::http::parser<false, boost::beast::http::buffer_body>. As I understand it, buffer_body means that the response body data is stored in a user-provided buffer. But when I set a chunk callback on the parser with its on_chunk_body method, the parser does not seem to use the provided buffer. In that case it even works when no buffer is provided.
So I need to understand how the HTTP parser manages memory when it receives a chunk. Does it use some internal buffer, or something else?
It seems the parser uses the provided buffer only for non-chunked responses. If so, is it correct to provide no buffer for chunked responses?
Upvotes: 2
Views: 1395
Reputation: 393769
Beast supports chunked encoding, so you do not need to deal with it yourself. Let's demonstrate that by downloading a chunked response from httpbin.org:
vector_body
To take the confusing part out of the picture first, let Beast decode the whole body:
void using_vector_body() {
    tcp::socket conn = send_get();

    http::response<http::vector_body<uint8_t>> res;
    beast::flat_buffer buf;
    read(conn, buf, res);

    std::cout << "response: " << res.base() << "\n";

    std::span body = res.body();
    fmt::print("body, {} bytes: {::0x} ... {::0x}\n", body.size(), body.first(10), body.last(10));

    auto checksum = reduce(begin(body), end(body), 0ull, std::bit_xor<>{});
    fmt::print("body checksum: {:#0x}\n", checksum);
}
Prints e.g.
response: HTTP/1.1 200 OK
Date: Wed, 22 Mar 2023 00:43:25 GMT
Content-Type: application/octet-stream
Transfer-Encoding: chunked
Connection: keep-alive
Server: gunicorn/19.9.0
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true
body, 2000 bytes: [39, c, 8c, 7d, 72, 47, 34, 2c, d8, 10] ... [fd, 18, b0, c3, a3, d5, d1, 4c, 99, c0]
body checksum: 0xa1
buffer_body
We need to use the parser interface because it will drive the body-reader, and we need to monitor is_done() on the parser.
For good style, we may replace the initial
read(conn, buf, p, ec);
with the more intentional:
read_header(conn, buf, p, ec);
We will receive need_buffer errors, which are expected, so we need to deal with them. Then we need to repeatedly point the body value at our buffer and see how much was actually decoded.
NOTE: Do not use the bytes_transferred returned from the [async_]read call here, because that includes everything from the wire, including the chunk headers and trailer headers. The interface for calculating the number of decoded body bytes is very unfriendly, but it is what you need.
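To make the difference between wire bytes and decoded bytes concrete, here is a minimal sketch of chunked framing that does not use Beast at all. `decode_chunked` and `Decoded` are hypothetical names for illustration only, and the toy decoder assumes a complete message in memory with no chunk extensions or trailer fields:

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Result of decoding a complete chunked message held in memory.
struct Decoded {
    std::string body;       // payload with the chunk framing removed
    std::size_t wire_bytes; // bytes consumed from the input, framing included
};

// Toy decoder (hypothetical helper, not part of Beast). It ignores chunk
// extensions and trailer fields and returns nullopt on malformed or
// truncated input.
std::optional<Decoded> decode_chunked(std::string_view in) {
    Decoded out{{}, 0};
    std::size_t pos = 0;
    for (;;) {
        // Each chunk starts with its size in hexadecimal, followed by CRLF.
        std::size_t eol = in.find("\r\n", pos);
        if (eol == std::string_view::npos || eol == pos) return std::nullopt;
        std::size_t len = 0;
        for (char c : in.substr(pos, eol - pos)) {
            int d;
            if (c >= '0' && c <= '9') d = c - '0';
            else if (c >= 'a' && c <= 'f') d = c - 'a' + 10;
            else if (c >= 'A' && c <= 'F') d = c - 'A' + 10;
            else return std::nullopt;
            len = len * 16 + static_cast<std::size_t>(d);
        }
        pos = eol + 2;
        if (len == 0) { // last chunk: expect the terminating CRLF
            if (in.substr(pos, 2) != "\r\n") return std::nullopt;
            out.wire_bytes = pos + 2;
            return out;
        }
        // The chunk data is followed by its own CRLF.
        if (in.size() - pos < len + 2) return std::nullopt;
        out.body.append(in.substr(pos, len));
        pos += len;
        if (in.substr(pos, 2) != "\r\n") return std::nullopt;
        pos += 2;
    }
}
```

For the input "4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n" this yields a 9-byte body ("Wikipedia") from 24 wire bytes. That gap is why the loop has to count decoded bytes from the body value instead of trusting bytes_transferred.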
Without further ado:
void using_buffer_body() {
    tcp::socket conn = send_get();

    http::response_parser<http::buffer_body> p;
    auto& res = p.get(); // convenience shorthands
    auto& body_val = res.body();

    beast::flat_buffer buf;
    error_code ec;
    read_header(conn, buf, p, ec);
    //read(conn, buf, p, ec);
    if (ec && ec != http::error::need_buffer) // expected
        throw boost::system::system_error(ec);

    assert(p.is_header_done());
    std::cout << "\n---\nresponse headers: " << res.base() << std::endl;

    size_t checksum = 0;
    size_t n = 0;

    while (!p.is_done()) {
        std::array<uint8_t, 512> block;
        body_val.data = block.data();
        body_val.size = block.size();
        read(conn, buf, p, ec);
        if (ec && ec != http::error::need_buffer) // expected
            throw boost::system::system_error(ec);

        // body_val.size now holds the unused remainder of block
        auto curr = block.size() - body_val.size;
        n += curr;

        std::cout << "parsed " << curr << " body bytes\n";

        for (auto b : std::span(block).first(curr))
            checksum ^= b;
    }

    fmt::print("body, {} bytes streaming decoded, chunked? {}\n", n, p.chunked());
    fmt::print("body checksum: {:#0x}\n", checksum);
}
The demo confirms that both methods result in the same body length with the same checksum:
#include <boost/beast.hpp>
#include <fmt/ranges.h>
#include <array>
#include <cassert>
#include <functional>
#include <iostream>
#include <numeric>
#include <span>
namespace net = boost::asio;
namespace beast = boost::beast;
namespace http = beast::http;
using boost::system::error_code;
using net::ip::tcp;

// Connect to httpbin.org and request 2000 pseudo-random bytes (fixed seed),
// which the server sends with chunked transfer encoding.
tcp::socket send_get() {
    net::system_executor ex;
    tcp::socket s(ex);
    connect(s, tcp::resolver(ex).resolve("httpbin.org", "http"));

    http::request<http::empty_body> req{http::verb::get, "/stream-bytes/2000?seed=42", 11};
    req.set(http::field::host, "httpbin.org");
    write(s, req);

    return s;
}

// Let Beast decode the entire body into a vector.
void using_vector_body() {
    tcp::socket conn = send_get();

    http::response<http::vector_body<uint8_t>> res;
    beast::flat_buffer buf;
    read(conn, buf, res);

    std::cout << "response: " << res.base() << "\n";

    std::span body = res.body();
    size_t const n = body.size();
    fmt::print("body, {} bytes: {::0x} ... {::0x}\n", n, body.first(10), body.last(10));

    auto checksum = std::reduce(begin(body), end(body), 0ull, std::bit_xor<>{});
    fmt::print("body checksum: {:#0x}\n", checksum);
}

// Stream the decoded body through a caller-owned 512-byte block.
void using_buffer_body() {
    tcp::socket conn = send_get();

    http::response_parser<http::buffer_body> p;
    auto& res = p.get(); // convenience shorthands
    auto& body_val = res.body();

    beast::flat_buffer buf;
    error_code ec;
    read_header(conn, buf, p, ec);
    //read(conn, buf, p, ec);
    if (ec && ec != http::error::need_buffer) // expected
        throw boost::system::system_error(ec);

    assert(p.is_header_done());
    std::cout << "\n---\nresponse headers: " << res.base() << std::endl;

    size_t checksum = 0;
    size_t n = 0;

    while (!p.is_done()) {
        std::array<uint8_t, 512> block;
        body_val.data = block.data();
        body_val.size = block.size();
        read(conn, buf, p, ec);
        if (ec && ec != http::error::need_buffer) // expected
            throw boost::system::system_error(ec);

        // body_val.size now holds the unused remainder of block
        size_t decoded = block.size() - body_val.size;
        n += decoded;

        std::cout << "parsed " << decoded << " body bytes\n";

        for (auto b : std::span(block).first(decoded))
            checksum ^= b;
    }

    fmt::print("body, {} bytes streaming decoded, chunked? {}\n", n, p.chunked());
    fmt::print("body checksum: {:#0x}\n", checksum);
}

int main() {
    using_vector_body();
    using_buffer_body();
}
Prints e.g.
response: HTTP/1.1 200 OK
Date: Wed, 22 Mar 2023 00:52:32 GMT
Content-Type: application/octet-stream
Transfer-Encoding: chunked
Connection: keep-alive
Server: gunicorn/19.9.0
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true
body, 2000 bytes: [39, c, 8c, 7d, 72, 47, 34, 2c, d8, 10] ... [fd, 18, b0, c3, a3, d5, d1, 4c, 99, c0]
body checksum: 0xa1
---
response headers: HTTP/1.1 200 OK
Date: Wed, 22 Mar 2023 00:52:32 GMT
Content-Type: application/octet-stream
Transfer-Encoding: chunked
Connection: keep-alive
Server: gunicorn/19.9.0
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true
parsed 512 body bytes
parsed 512 body bytes
parsed 512 body bytes
parsed 464 body bytes
body, 2000 bytes streaming decoded, chunked? true
body checksum: 0xa1
Upvotes: 2