user54808

Reputation:

Problem supporting keep-alive sockets on a home-grown http server

I am currently experimenting with building an HTTP server. The server is multi-threaded: one listening thread that uses select(...), and four worker threads managed by a thread pool. I'm currently managing around 14k-16k requests per second with a document length of 70 bytes and a response time of 6-10 ms, on a Core i3 330M. But this is without keep-alive: any socket I serve is closed immediately once the work is done.

EDIT: The worker threads process 'jobs' that are dispatched when activity is detected on a socket, i.e. service requests. After a 'job' is completed, if there are no more 'jobs' we sleep until more 'jobs' get dispatched, or if some are already available, we start processing one of them.
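
To make the dispatch model concrete, the worker loop looks roughly like this (a simplified sketch; job_t, dispatch_job and service_request are illustrative names, not my actual code):

    /* Simplified sketch of the dispatch model described above. */
    #include <pthread.h>
    #include <stddef.h>

    typedef struct job { struct job *next; int fd; } job_t;

    extern void service_request(int fd);    /* parse request, send response */

    static job_t *job_head = NULL;
    static pthread_mutex_t job_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  job_cond = PTHREAD_COND_INITIALIZER;

    void dispatch_job(job_t *j)             /* called when socket activity is detected */
    {
        pthread_mutex_lock(&job_lock);
        j->next = job_head;
        job_head = j;
        pthread_cond_signal(&job_cond);     /* wake one sleeping worker */
        pthread_mutex_unlock(&job_lock);
    }

    void *worker_thread(void *arg)
    {
        (void)arg;
        for (;;) {
            pthread_mutex_lock(&job_lock);
            while (job_head == NULL)        /* sleep until a job is dispatched */
                pthread_cond_wait(&job_cond, &job_lock);
            job_t *j = job_head;
            job_head = j->next;
            pthread_mutex_unlock(&job_lock);
            service_request(j->fd);
        }
        return NULL;
    }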

My problems started when I began trying to implement keep-alive support. With keep-alive activated I only manage 1.5k-2.2k requests per second with 100 open sockets. This number grows to around 12k with 1000 open sockets. In both cases the response time is somewhere around 60-90 ms. I find this quite odd, since my current assumptions say that the request rate should go up, not down, and the response time should hopefully go down, but definitely not up.

I've tried several different strategies for fixing the low performance:

    1. Call select(...)/pselect(...) with a timeout value so that we can rebuild our FD_SET structure, listen to any additional sockets that arrived after we blocked, and service any detected socket activity. (Aside from the low performance, there's also the problem of sockets being closed while we're blocking, resulting in select(...)/pselect(...) reporting a bad file descriptor.)
    2. Have one listening thread that only accepts new connections, and one keep-alive thread that is notified via a pipe of any new sockets that arrived after we blocked and of any new socket activity, and that rebuilds the FD_SET. (Same additional problem here as in '1.'.)
    3. select(...)/pselect(...) with a timeout; when new work is to be done, detach the linked-list entry for the socket that has activity, and add it back once the request has been serviced. Rebuilding the FD_SET will hopefully be faster. This way we also avoid trying to listen on any bad file descriptors.
    4. Combined (2.) and (3.).

    -. Probably a few more, but they escape me atm.

The keep-alive sockets are stored in a simple linked list whose add/remove functions are guarded by a pthread_mutex lock; the function responsible for rebuilding the FD_SET also takes this lock.
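
To make this concrete, the list and the rebuild look roughly like this (a stripped-down sketch of strategy 1; the names are illustrative, not my actual code):

    /* Keep-alive list guarded by a mutex, and a select() loop that
     * rebuilds the FD_SET on every pass (strategy 1). Sketch only. */
    #include <sys/select.h>
    #include <pthread.h>
    #include <stddef.h>

    typedef struct ka_sock { struct ka_sock *next; int fd; } ka_sock_t;

    static ka_sock_t *ka_head = NULL;               /* add/remove also take ka_lock */
    static pthread_mutex_t ka_lock = PTHREAD_MUTEX_INITIALIZER;

    static int rebuild_fdset(fd_set *set)
    {
        int maxfd = -1;
        FD_ZERO(set);
        pthread_mutex_lock(&ka_lock);
        for (ka_sock_t *s = ka_head; s != NULL; s = s->next) {
            FD_SET(s->fd, set);
            if (s->fd > maxfd)
                maxfd = s->fd;
        }
        pthread_mutex_unlock(&ka_lock);
        return maxfd;
    }

    void *keepalive_thread(void *arg)
    {
        (void)arg;
        for (;;) {
            fd_set rset;
            int maxfd = rebuild_fdset(&rset);
            struct timeval tv = { 0, 50 * 1000 };   /* 50 ms timeout */
            int n = select(maxfd + 1, &rset, NULL, NULL, &tv);
            if (n <= 0)
                continue;       /* timeout or EBADF: loop around and rebuild */
            /* otherwise walk the list again and dispatch a job
             * for every fd that is set in rset */
        }
        return NULL;
    }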

I suspect that the constant locking/unlocking of the mutex is the main culprit here. I've tried to profile the problem, but neither gprof nor google-perftools has been very cooperative, either introducing extreme instability or plainly refusing to gather any data at all (this could be me not knowing how to use the tools properly, though). But removing the locks risks putting the linked list into an inconsistent state and will probably crash the program or send it into an infinite loop. I've also suspected the select(...)/pselect(...) timeout when I've used it, but I'm pretty confident that this was not the problem, since the low performance persists even without it.

I'm at a loss as to how I should handle keep-alive sockets, and I'm therefore wondering if you people out there have any suggestions on how to fix the low performance, or alternative methods I could use to support keep-alive sockets.

Upvotes: 11

Views: 1794

Answers (5)

The Lazy Coder

Reputation: 11838

The time increase will be more visible when the client uses your socket for more than one request. If you are merely opening and closing the connection yet still telling the client to keep it alive, then you have the same scenario as you did without keep-alive, but now with the added overhead of the sockets sticking around.

If, however, you are using the same socket for multiple requests from the same client, then you will lose the TCP connection-setup overhead and gain performance that way.

Make sure your client is using keep-alive properly. You also likely need a better way to get notification of a socket's state and data: perhaps a poll device, or queuing the requests.

http://www.techrepublic.com/article/using-the-select-and-poll-methods/1044098

This page has a patch for Linux to handle a poll device. With some understanding of how it works, you can use the same technique in your application rather than rely on a device that may not be installed.
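
Even without the device, the standard poll() call already lets you keep one persistent array of descriptors between calls instead of rebuilding an fd_set every time. A rough sketch (the array is assumed to be filled in elsewhere with fd and events = POLLIN; handle_socket is a placeholder):

    /* Rough sketch: poll() over a persistent pollfd array. Entries with
     * fd = -1 are ignored, so closed sockets don't force a full rebuild. */
    #include <poll.h>

    extern void handle_socket(int fd);   /* placeholder: dispatch the work */

    int wait_for_activity(struct pollfd *fds, nfds_t nfds)
    {
        int n = poll(fds, nfds, 50 /* ms timeout */);
        if (n <= 0)
            return n;                    /* timeout or error */
        for (nfds_t i = 0; i < nfds; i++) {
            if (fds[i].fd >= 0 && (fds[i].revents & (POLLIN | POLLHUP)))
                handle_socket(fds[i].fd);
        }
        return n;
    }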

Upvotes: 0

Chris Cleeland

Reputation: 4900

What you are trying to do has been done before. Consider reading about the Leader/Followers network server pattern: http://www.kircher-schwanninger.de/michael/publications/lf.pdf

Upvotes: 0

David Waters

Reputation: 12028

Are your test clients reusing the sockets? Are they correctly handling keep-alive? I could see the case where you make the minimum possible change to your benchmarking code by just passing the keep-alive header, but then don't change the client code, so that the socket is still closed at the client end once the payload is received. This would incur all the costs of keep-alive with none of the benefits.

Upvotes: 0

ninjalj

Reputation: 43708

There are many alternatives:

  • Use processes instead of threads, and pass file descriptors via Unix sockets.
  • Maintain per-thread lists of sockets. You could even accept() directly on the worker threads (see the sketch after this list).
  • etc...
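
A minimal sketch of the second option (the listening-socket setup is omitted; serve_connection is a placeholder, not a real API):

    /* Each worker thread accept()s on the shared listening socket and
     * keeps its own private set of keep-alive connections, so no list
     * or mutex is shared between threads. Illustrative only. */
    #include <pthread.h>
    #include <sys/socket.h>

    extern int listen_fd;                  /* created, bound and listening elsewhere */
    extern void serve_connection(int fd);  /* placeholder: add fd to this thread's
                                              private list and select()/poll() over it */

    void *worker_main(void *arg)
    {
        (void)arg;
        for (;;) {
            int client = accept(listen_fd, NULL, NULL);  /* accept() is thread-safe */
            if (client < 0)
                continue;
            serve_connection(client);      /* only this thread ever sees this fd */
        }
        return NULL;
    }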

Upvotes: 0

blaze

Reputation: 4364

Try to get rid of select completely. You can find some kind of event notification mechanism on every popular platform: kqueue/kevent on FreeBSD, epoll on Linux, etc. This way you do not need to rebuild the FD_SET and can add/remove watched fds at any time.
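
A bare-bones epoll loop looks like this (error handling trimmed; handle_event is a placeholder for dispatching a job to your pool):

    /* Bare-bones epoll loop; error handling trimmed for brevity. */
    #include <sys/epoll.h>

    #define MAX_EVENTS 64

    extern void handle_event(int fd);    /* placeholder: dispatch a job */

    void event_loop(int listen_fd)
    {
        int epfd = epoll_create1(0);
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = listen_fd };
        epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);

        for (;;) {
            struct epoll_event events[MAX_EVENTS];
            int n = epoll_wait(epfd, events, MAX_EVENTS, -1);
            for (int i = 0; i < n; i++)
                handle_event(events[i].data.fd);
            /* keep-alive sockets are added with EPOLL_CTL_ADD when accepted
             * and removed with EPOLL_CTL_DEL when closed; nothing is rebuilt */
        }
    }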

Upvotes: 6
