Reputation: 157
I’m working on a packet reshaping project in Linux using the BeagleBone Black. Basically, packets are received on one VLAN, modified, and then are sent out on a different VLAN. This process is bidirectional - the VLANs are not designated as being input-only or output-only. It’s similar to a network bridge, but packets are altered (sometimes fairly significantly) in-transit.
I’ve tried two different methods for accomplishing this:
The second option appears to be around 4 times faster than the first option. I’d like to understand more about why this is. I’ve tried brainstorming a bit and wondered if there is a substantial performance hit in rapidly switching between kernel and user space, or maybe something about the socket interface is inherently slow?
I think the user application is fairly optimized (for example, I’m using PACKET_MMAP), but it’s possible that it could be optimized further. I ran perf on the application and noticed that it was spending a good deal of time (35%) in v7_flush_kern_dcache_area, so perhaps this is a likely candidate. If there are any other suggestions on common ways to optimize packet processing I can give them a try.
Upvotes: 0
Views: 1363
Reputation: 862
The performance of the user space application depends on the used syscall to monitor the sockets too. The fastest syscall is epoll() when you need to handle a lot of sockets. select() will perform very poor, if you handle a lot of sockets.
See this post explaining it: Why is epoll faster than select?
Upvotes: 0
Reputation: 9770
Context switches are expensive and kernel to user space switches imply a context switch. You can see this article for exact numbers, but the stated durations are all in the order of microseconds.
You can also use lmbench to benchmark the real cost of context switches on your particular cpu.
Upvotes: 1