goodolddays

Reputation: 2683

multithreaded epoll

I am creating a multithreaded server using epoll (edge-triggered) and non-blocking sockets. Currently I'm creating an event loop on the main thread and waiting for notifications, and it works correctly.
I have to choose between two approaches to make it multithreaded:

  1. Create an event loop for each thread and add the server socket's file descriptor to each loop so that every thread watches it for notifications. (Is that possible? I mean: is epoll thread-safe?)
  2. Create a single event loop and wait for notifications. Whenever a notification is received, spawn a thread to handle it.

If I use the first method, is there a chance that multiple threads get notified of the same event? How can I handle this situation?

What could be the best approach? Thank you.

Upvotes: 14

Views: 18698

Answers (6)

Eric

Reputation: 1259

Every answer says epoll is thread-safe but that's not true.

epoll calls are thread-safe at the most basic level: when considering the list of file descriptors, the events, etc. In particular, using EPOLLET will allow only one thread to receive an event, whereas level-triggered epoll will potentially deliver the same event to multiple threads, which causes the thundering herd that others mention elsewhere on this page.

However, it's not thread-safe at the scheduling level: a thread receives an event (say, with EPOLLET, it's woken up for the first packet that comes from the network), and then, while it's still processing other file descriptors in its long list of events, the next packet arrives on the network and triggers another event that another thread picks up. Now you've got two threads running concurrently to retrieve data from the same socket.

One can imagine several scenarios where this could break an app, depending on the read patterns. In addition to this race, OS scheduling can mess with the timings and favor one of the two threads, which is problematic when a thread needs multiple reads (e.g. it first reads the size, then the payload, but gets time-sliced in between).

Even if the first thread reads everything, the second thread will read nothing (unless data has arrived from the network between the two reads). Say this "everything" contains half an HTTP request. While the first thread parses that half request, the second half arrives on the network, and another thread gets an event and reads the second half. How do you handle that without synchronization?

The approach you describe of using the same socket in multiple threads is only thread safe if you can ensure two things:

  1. Reads will atomically retrieve a whole number of complete messages, which is nearly impossible to ensure due to how networking works (path MTU, TCP window...). In practice that only works with fixed-length messages.
  2. Messages can be processed out of order.

Upvotes: 0

jingchun.zhang

Reputation: 81

epoll is thread-safe.

I hope the following code helps:

https://github.com/jingchunzhang/56vfs/blob/master/network/vfs_so_r.c

Upvotes: 0

CodeSun

Reputation: 41

I'm also writing a server using epoll, and I have considered the same models you describe.

It is possible to use option 1, but it may cause the "thundering herd" effect; you can read the nginx source to see how it solves that. As for option 2, I think it is better to use a thread pool instead of spawning a new thread each time.

You can also use the following model:

Main thread/process: accept incoming connections with blocking I/O, and pass each fd to the other threads using a BlockingList, or to the other processes using a pipe.

Sub threads/processes: each creates its own epoll instance, adds the incoming fds to it, and then processes them with non-blocking I/O.

Upvotes: 2

Bert Young

Reputation: 21

An event loop for each thread is the most flexible approach and gives high performance. You should create one epoll fd for each event loop; then there is no need to worry about whether epoll is thread-safe.

Upvotes: 1

edsiper

Reputation: 416

epoll is thread-safe. A good solution is to have your main process stay in accept(2) and, once you get the file descriptor, register it in the epoll fd of the target thread. That means you have an epoll queue for each thread: when you create a thread, you pass its epoll file descriptor as the parameter in the call to pthread_create(3). Then, when a new connection arrives, you do an epoll_ctl(...EPOLL_CTL_ADD..) using the epoll fd of the target thread and the new socket returned by accept(2). Make sense?
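
The dispatch step described above could look roughly like this. It is only a sketch: assign_fd and accept_loop are hypothetical names, the round-robin choice of target thread is an assumption (the answer does not say how the target is picked), and epfds is assumed to hold one epoll fd per worker thread, each of which is blocked in epoll_wait on its own entry:

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Register an accepted socket with the next worker's epoll fd.
 * Returns the index of the worker chosen, or -1 on error.
 * Calling epoll_ctl from this thread while the worker sits in
 * epoll_wait on the same epoll fd is allowed. */
int assign_fd(const int *epfds, int nworkers, int *next, int fd)
{
    int fl = fcntl(fd, F_GETFL, 0);
    if (fl < 0 || fcntl(fd, F_SETFL, fl | O_NONBLOCK) < 0)
        return -1;

    struct epoll_event ev = {0};
    ev.events = EPOLLIN | EPOLLET;
    ev.data.fd = fd;

    int target = *next;
    if (epoll_ctl(epfds[target], EPOLL_CTL_ADD, fd, &ev) < 0)
        return -1;
    *next = (target + 1) % nworkers;    /* assumed policy: round-robin */
    return target;
}

/* Main thread: stay in blocking accept(2) and hand sockets out. */
void accept_loop(int listen_fd, const int *epfds, int nworkers)
{
    int next = 0;
    for (;;) {
        int fd = accept(listen_fd, NULL, NULL);
        if (fd < 0) {
            if (errno == EINTR)
                continue;
            break;
        }
        if (assign_fd(epfds, nworkers, &next, fd) < 0)
            close(fd);
    }
}
```

Because each socket is added to exactly one epoll fd, only the thread that owns that epoll fd is ever woken for it.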

Upvotes: 0

solofox

Reputation: 251

I think option 1 is more popular, since the primary purpose of non-blocking I/O is to avoid the overhead of creating and destroying threads.

Take the popular web server nginx as an example: it creates multiple processes (not threads) to handle incoming events on a handle, and then processes the events in the subprocesses. All of them share the same listening socket. It's quite similar to option 1.

Upvotes: 7
