Reputation: 2859
The code below works: it's sending all the correct data and receiving all the correct data back.
When I use it to benchmark a very fast server, the benchmark's CPU usage is ~10%. However, when I benchmark a slow server, that rises to ~50% – the same as the server I'm benchmarking/stress testing*.
That is going by what top is reporting.
Why would it use so much CPU? I suspect I'm misusing poll, but I'm not sure how.
The CPU time for the slow server is 4x that of the benchmark, while for the fast server it is 7x that of the benchmark.
// Put the socket into non-blocking mode.
int flags = fcntl(sockfd, F_GETFL, 0);
assert(flags != -1);
assert(fcntl(sockfd, F_SETFL, flags | O_NONBLOCK) != -1);

int32 red = 0;
struct pollfd pollfd = {
    .fd = sockfd,
    .events = POLLIN | POLLOUT
};

// Pump the send buffer out while reading whatever the server has sent back so far.
do {
    assert(poll(&pollfd, 1, -1) == 1);

    if (pollfd.revents & POLLOUT) {
        int n;
        while ((n = send(sockfd, buf__+bufOffset, bufLength-bufOffset, MSG_NOSIGNAL)) > 0) {
            bufOffset += n;
            if (n != bufLength-bufOffset)
                break;
        }
        assert(!(n == -1 && errno != EAGAIN && errno != EWOULDBLOCK));
    }

    if (pollfd.revents & POLLIN) {
        int r;
        while ((r = read(sockfd, recvBuf, MIN(recvLength-red, recvBufLength))) > 0) {
            // assert(memcmp(recvBuf, recvExpectedBuf+red, r) == 0);
            red += r;
            if (r != MIN(recvLength-red, recvBufLength))
                break;
        }
        assert(!(r == -1 && errno != EAGAIN && errno != EWOULDBLOCK));
    }
} while (bufOffset < bufLength);

// Everything has been sent; switch back to blocking mode and read the rest of the reply.
assert(fcntl(sockfd, F_SETFL, flags & ~O_NONBLOCK) != -1);

int r;
while ((r = read(sockfd, recvBuf, MIN(recvLength-red, recvBufLength))) > 0) {
    // assert(memcmp(recvBuf, recvExpectedBuf+red, r) == 0);
    red += r;
}

// Switch to non-blocking again and check that nothing unexpected is left to read.
assert(fcntl(sockfd, F_SETFL, flags | O_NONBLOCK) != -1);
assert(red == recvLength);

int r = read(sockfd, recvBuf, 1);
assert((r == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) || r == 0);
* (I'm running both benchmark and server on the same machine, for now. Communication is over TCP.)
Upvotes: 0
Views: 212
Reputation: 2859
Problem solved.
It wasn't misreported CPU usage exactly. The inefficient server was sending 8-byte packets with TCP_NODELAY set, so I was receiving millions of poll notifications just to read 8 bytes at a time. It turns out the read(2) call is rather expensive, and making it tens of thousands of times per second was enough to send "time spent in system mode" rocketing to ~56%, which was added to "time spent in user mode" to produce the very high CPU usage.
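For reference, the knob involved is just a setsockopt(2) on the server's accepted connection. A minimal sketch (conn_fd is a placeholder name, not something from my code); with TCP_NODELAY cleared, Nagle's algorithm coalesces those tiny writes into larger segments, so the client sees far fewer poll wake-ups and read calls:

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdio.h>
#include <sys/socket.h>

/* conn_fd is a placeholder for the server's accepted connection. */
static void reenable_nagle(int conn_fd) {
    int off = 0;  /* 0 clears TCP_NODELAY, letting Nagle coalesce small writes */
    if (setsockopt(conn_fd, IPPROTO_TCP, TCP_NODELAY, &off, sizeof(off)) == -1)
        perror("setsockopt(TCP_NODELAY)");
}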
Upvotes: 0
Reputation:
So, if I've finally understood this, you're comparing the ratio of %CPU reported by top to the ratio of the rates of increase of TIME+ reported by top, and they don't agree. (It would have been easier if you had said which columns you were reading from!) As far as I can tell, both are calculated from the same fields in the underlying /proc data, so it shouldn't be possible for them to disagree by much.
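To make that concrete, here is a rough sketch of where I believe both numbers come from (assuming a normal Linux /proc layout): the utime and stime tick counters in /proc/<pid>/stat, which top folds into both %CPU (as a rate) and TIME+ (as a running total).

#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    char path[64], line[1024];
    snprintf(path, sizeof(path), "/proc/%d/stat", (int)getpid());
    FILE *f = fopen(path, "r");
    if (!f)
        return 1;
    if (!fgets(line, sizeof(line), f)) {
        fclose(f);
        return 1;
    }
    fclose(f);

    /* comm (field 2) may contain spaces, so parse from the last ')'. */
    char *p = strrchr(line, ')');
    if (!p)
        return 1;
    unsigned long utime, stime;
    /* After ')': state, ppid, pgrp, session, tty_nr, tpgid, flags,
       minflt, cminflt, majflt, cmajflt, then utime and stime. */
    sscanf(p + 2, "%*c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %lu %lu",
           &utime, &stime);
    printf("utime=%lu stime=%lu (clock ticks)\n", utime, stime);
    return 0;
}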
And I can't replicate it. I've put your code into a test program and run it with no modifications other than fixing an int r redeclaration compile error and adding what I believe to be reasonable declarations for all the stuff you left out. I connected it to a server that reads lines from the client and eats a little bit of CPU after each one before sending a line back. The result was that top showed %CPU around 99 for the server and 2 for the client, and about a 50-to-1 ratio in the TIME+ column.
I find nothing wrong with the use of poll.
I don't like your use of assert, though: when assertions are turned off (NDEBUG), the program is going to be missing a lot of important syscalls.
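To illustrate with one line from your code: build with -DNDEBUG and the whole expression inside assert is compiled away, poll() included. The safer pattern is to make the syscall unconditionally and only assert on the result (a sketch, reusing your pollfd):

/* assert(poll(&pollfd, 1, -1) == 1);   -- vanishes entirely under -DNDEBUG */

int ready = poll(&pollfd, 1, -1);   /* the syscall always happens */
assert(ready == 1);                 /* only the check can disappear */
(void)ready;                        /* avoids an unused-variable warning with NDEBUG */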
Upvotes: 0
Reputation: 25129
The reason is that you are busy-waiting. If read and write return EAGAIN or EWOULDBLOCK, you are calling them again immediately. Before retrying, add a select which will wait until the socket is ready for reading or writing.
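Something along these lines (a sketch only, reusing the sockfd from the question):

#include <sys/select.h>

fd_set rfds, wfds;
FD_ZERO(&rfds);
FD_ZERO(&wfds);
FD_SET(sockfd, &rfds);   /* wake when the socket becomes readable ... */
FD_SET(sockfd, &wfds);   /* ... or writable */

/* NULL timeout: sleep until one of the events fires instead of spinning. */
if (select(sockfd + 1, &rfds, &wfds, NULL, NULL) > 0) {
    if (FD_ISSET(sockfd, &wfds)) {
        /* socket is writable: retry send() */
    }
    if (FD_ISSET(sockfd, &rfds)) {
        /* socket is readable: retry read() */
    }
}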
Upvotes: 1