intrigued_66

Reputation: 17248

Pushing code towards kernel or user space, for performance reasons?

Originally I thought that, to make code faster, it would be better to reduce the number of transitions between kernel and user space by pushing more of the code into the kernel. However, I have read on a few forums, such as SO, that the opposite is actually done: more of the code is pushed into user space. Why is this? It seems counterintuitive. Putting more of the code into user space still requires kernel-user transitions, whereas putting the code in the kernel doesn't.

In case anyone asks: I am thinking about an application that processes packet data.

EDIT

So, more details: when packet data arrives, I want to rewrite the network stack, cut out the code that isn't applicable to my packet processing, and have zero copy, putting the packet data somewhere the user program can access it as quickly as possible.

Upvotes: 2

Views: 1352

Answers (3)

Mppl

Reputation: 961

Transitions from user-mode to kernel-mode take some time and resources, so keeping the code in only one of the modes may increase performance.

As mentioned, in your case the best option is probably to fetch the data as fast as possible, make it available in user-land right away, and do the processing there. Moving all the processing to kernel level seems unnecessary to me unless you have a good reason to do so. With no further information, I see no reason to believe the processing would be faster in kernel mode than in user mode. All you would save is a mode transition now and then, which shouldn't be relevant.
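Whether a mode transition "now and then" is relevant can be checked by measuring it. The following is a rough sketch, assuming a POSIX system: it times a zero-byte `read()`, which enters the kernel and returns almost immediately, so the average approximates the round-trip cost of one transition. The function names `avg_syscall_ns` and `elapsed_ns` are made up for this illustration, and the numbers will vary by machine and kernel.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>
#include <unistd.h>

/* Nanoseconds between two CLOCK_MONOTONIC timestamps. */
static double elapsed_ns(struct timespec start, struct timespec end)
{
    return (double)(end.tv_sec - start.tv_sec) * 1e9 +
           (double)(end.tv_nsec - start.tv_nsec);
}

/*
 * Average cost, in nanoseconds, of one zero-byte read() on fd.
 * Hypothetical helper for illustration; returns -1.0 on error.
 */
double avg_syscall_ns(int fd, int iterations)
{
    struct timespec start, end;
    char byte;

    assert(iterations > 0);
    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1.0;
    for (int i = 0; i < iterations; i++) {
        /* Count of 0: no data moves, only the transition is paid. */
        if (read(fd, &byte, 0) < 0)
            return -1.0;
    }
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1.0;
    return elapsed_ns(start, end) / iterations;
}
```

Calling `avg_syscall_ns(fd, 1000000)` on any readable descriptor and comparing the result with the time spent processing one packet gives a first idea of whether the transitions matter for a given workload.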

Upvotes: 1

Mike

Reputation: 49403

The kernel is a time-sensitive area: it's where your ISRs, timer tick routines, and hardware critical sections reside. Because of this, the objective is to keep kernel code small and tight: get in, get your work done, and get out.

In your case you're getting packets from the network, which is a hardware-dependent task (you need to get data from the lower network layers). So get your data, clear the buffers, and send it to user space via a DMA transfer; then do your processing in user space.

In my experience, the performance gained by executing your code in the kernel will not outweigh the performance lost overall by executing more code in the kernel.

Upvotes: 4

Mats Petersson

Reputation: 129374

If you expect your code to go into the official kernel release, "shuffling user mode parts of it into the kernel" is probably a bad idea as a rule.

Of course, if you can prove that doing so is the BEST (subjective, I know) way to achieve better performance, and the cost is acceptable (extra code in the kernel means more maintenance burden on the kernel, and a bigger kernel means more complaints about the kernel being "too big", etc.), then by all means follow that route.

But in general, it's probably better to approach this by doing more work in user mode and making the kernel-mode task smaller, if that is at all an option. Without knowing exactly what you are doing in the kernel and what you are doing in user mode, it's hard to say for sure what you should or shouldn't do. But, for example, batching up a dozen "items" into a block that is ONE request for the kernel to do something is a better option than calling the kernel a dozen times.
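As a minimal sketch of that batching idea, assuming a POSIX system: `send_one_by_one` issues one `write()` per item, while `send_batched` gathers the same items into a single `writev()` call, so the kernel is entered once instead of once per item. Both function names and the `MAX_ITEMS` limit are invented for this illustration, and a real program would also have to handle short writes.

```c
#include <assert.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

#define MAX_ITEMS 16 /* arbitrary batch limit chosen for this sketch */

/* One write() per item: n user/kernel transitions for n items. */
ssize_t send_one_by_one(int fd, const char *const items[], int n)
{
    ssize_t total = 0;

    for (int i = 0; i < n; i++) {
        ssize_t w = write(fd, items[i], strlen(items[i]));
        if (w < 0)
            return -1;
        total += w;
    }
    return total;
}

/* One writev() for the whole batch: a single transition. */
ssize_t send_batched(int fd, const char *const items[], int n)
{
    struct iovec iov[MAX_ITEMS];

    assert(n >= 0 && n <= MAX_ITEMS);
    for (int i = 0; i < n; i++) {
        /* writev() only reads from the buffers; the cast drops const. */
        iov[i].iov_base = (void *)items[i];
        iov[i].iov_len = strlen(items[i]);
    }
    /* Returns the total bytes written, which may be short. */
    return writev(fd, iov, n);
}
```

Both functions put the same bytes on the descriptor; the only difference is how many times the kernel is called to do it.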

In response to your edit describing what you are doing: would it not be better to pass a user-mode memory region to receive the data, and then just copy into that when the packet arrives? Assuming "all memory is equal" (if it isn't, you have problems with "in place use" anyway), this should work just as well, with less time spent in the kernel.
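A minimal sketch of that approach, assuming ordinary POSIX sockets rather than a custom network stack: the application owns a region of packet slots and hands one slot to `recv()` per call, so the kernel copies each packet directly into memory the program then processes in place. The `rx_region` type, the `rx_init` and `rx_receive` functions, and the slot sizes are all hypothetical names and values chosen for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

#define SLOT_SIZE 2048 /* assumed upper bound on packet size */
#define SLOT_COUNT 8   /* assumed number of slots in the region */

/* User-mode memory region that receives packet data. */
struct rx_region {
    unsigned char data[SLOT_COUNT][SLOT_SIZE];
    ssize_t len[SLOT_COUNT]; /* bytes stored in each slot */
    unsigned next;           /* count of packets received so far */
};

void rx_init(struct rx_region *r)
{
    assert(r != NULL);
    memset(r, 0, sizeof *r);
}

/*
 * Receive one packet into the next slot (slots are reused round-robin).
 * The single copy is the one the kernel does into r->data[slot].
 * Returns the packet length, or -1 on error.
 */
ssize_t rx_receive(struct rx_region *r, int sock)
{
    unsigned slot = r->next % SLOT_COUNT;
    ssize_t n = recv(sock, r->data[slot], SLOT_SIZE, 0);

    if (n >= 0) {
        r->len[slot] = n;
        r->next++;
    }
    return n;
}
```

After `rx_receive` returns, the packet can be parsed directly from `r->data[slot]` without any further copying in user space.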

Upvotes: 1
