Reputation: 126
I'm attempting to optimize a program I wrote which aims to replicate network flows by sending packets to a specified MAC address.
The main loop of my program that is responsible for the sending and removal of flows is as follows:
while (size != 0 || response) {
    for (i = 0; size != 0 && i < size; ++i) {
        curFlow = *pCurFlow;
        while (curFlow.cur_time < now) {
            // Send one packet for this flow
            sendto(sockfd, curFlow.buff, curFlow.length, 0,
                   memAddr, sAddrSize);
            // Adjust the flow's attributes
            curFlow.packets_left -= 1;
            curFlow.cur_time += curFlow.d_time;
            // If the flow has no packets left, unlink it from the list
            if (!curFlow.packets_left) {
                pCurFlow->last->next = pCurFlow->next;
                pCurFlow->next->last = pCurFlow->last;
                size -= 1;
                break;
            }
        }
        *pCurFlow = curFlow;
        pCurFlow = pCurFlow->next;
    }
}
I've begun using the perf profiler to record which functions my program is calling and how much overhead each one incurs. However, every time I ask perf for a report, the output looks like this:
Overhead Command Shared Object Symbol
15.34% packetize /proc/kcore 0x7fff9c805b73 k [k] do_syscall_64
6.19% packetize /proc/kcore 0x7fff9d20214f k [k] syscall_return_via_sysret
5.98% packetize /proc/kcore 0x7fff9d1a3de6 k [k] _raw_spin_lock
5.29% packetize /proc/kcore 0x7fffc0512e9f k [k] mlx4_en_xmit
5.26% packetize /proc/kcore 0x7fff9d16784d k [k] packet_sendmsg
(Note: "packetize" is the name of my program)
My question is: what the heck is "do_syscall_64"? After some research, it seems this function is some kind of kernel mechanism used as an interrupt request.
Furthermore, I've found that the file /proc/kcore is involved in some aspects of memory management, although upon purposefully ransacking my program with extra memory references, the only overhead that increased in perf report was the dynamic library I use with my program.
Please let me know if you have any advice for me. Thank you!
Upvotes: 1
Views: 2789
Reputation: 366016
It's not an interrupt request; it's the C function called from the syscall entry point that dispatches to the appropriate C function implementing the system call selected by a register passed from user-space. Presumably sys_sendto in this case.
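You can see that mechanism from user-space by invoking the system call by number yourself. A minimal sketch (raw_sendto is a made-up name; SYS_sendto and syscall(2) are real), equivalent to your sendto() call:

#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/socket.h>

// glibc's sendto() does essentially this: put SYS_sendto in a register,
// execute the syscall instruction, and let do_syscall_64 dispatch it
// to the kernel's sys_sendto on the other side.
static ssize_t raw_sendto(int fd, const void *buf, size_t len,
                          const struct sockaddr *dst, socklen_t dstlen)
{
    return syscall(SYS_sendto, fd, buf, len, 0, dst, dstlen);
}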
In older versions of Linux, the x86-64 syscall entry point used the table of system-call function pointers directly (e.g. as shown in this Q&A, where only the 32-bit entry points like the one for int 0x80 used a C wrapper function).
But with the changes for Spectre and Meltdown mitigation, the native 64-bit system-call entry point (into a 64-bit kernel from 64-bit user-space) also uses a C wrapper around system-call dispatching. This allows using C macros and gcc hints to control speculation barriers before the indirect branch. The current Linux version of do_syscall_64 on GitHub is a pretty simple function; it's somewhat surprising it's getting so many cycles itself, unless nr = array_index_nospec(nr, NR_syscalls); is a lot more expensive than I'd expect on your CPU.
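For reference, here's roughly what that function does, paraphrased and heavily simplified from arch/x86/entry/common.c (details vary by kernel version; this is a sketch, not the exact source):

__visible void do_syscall_64(unsigned long nr, struct pt_regs *regs)
{
    enter_from_user_mode();
    local_irq_enable();

    if (likely(nr < NR_syscalls)) {
        // Spectre v1 mitigation: clamp the index even under
        // (mis)speculation before indexing the function-pointer table
        nr = array_index_nospec(nr, NR_syscalls);
        regs->ax = sys_call_table[nr](regs);
    }

    syscall_return_slowpath(regs);
}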
There's definitely expensive stuff that happens in the hand-written asm syscall entry point, e.g. writing the MSR that flushes the branch-prediction cache. Or maybe lack of good branch prediction is costing extra cycles in the first C function called after that.
System-call intensive workloads suffer a lot from Spectre / Meltdown mitigations. It might be interesting to try booting with some of them disabled, and/or with an older kernel that doesn't have that code at all.
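(On recent kernels a single boot parameter, mitigations=off, disables most of them for a test run; older kernels have per-issue flags like nopti and nospectre_v2. /sys/devices/system/cpu/vulnerabilities/ shows what's currently active.)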
Meltdown / L1TF / etc. are completely fixed in the newest Intel CPUs with no performance cost, so disabling workarounds for that might give you some clue how much benefit you'd get from a brand new CPU.
(Spectre is still a very hard problem and can't be easily fixed with a local change to the load ports. IDK what the details are of how efficient various mitigation microcode-assisted or not strategies for mitigating it are on various CPUs.)
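Separately from the mitigations: since every sendto() pays the full entry/exit cost, batching packets with Linux's sendmmsg(2) amortizes one kernel entry over many packets. A minimal sketch under assumed names (bufs/lens stand in for your per-flow buffers; error handling omitted):

#define _GNU_SOURCE
#include <string.h>
#include <sys/uio.h>
#include <sys/socket.h>

// Send up to 64 prepared packets with a single system call.
static int send_batch(int fd, char **bufs, size_t *lens, unsigned n,
                      struct sockaddr *dst, socklen_t dstlen)
{
    struct mmsghdr msgs[64];
    struct iovec iovs[64];

    if (n > 64)
        n = 64;
    memset(msgs, 0, sizeof msgs);
    for (unsigned i = 0; i < n; i++) {
        iovs[i].iov_base = bufs[i];
        iovs[i].iov_len = lens[i];
        msgs[i].msg_hdr.msg_iov = &iovs[i];
        msgs[i].msg_hdr.msg_iovlen = 1;
        msgs[i].msg_hdr.msg_name = dst;
        msgs[i].msg_hdr.msg_namelen = dstlen;
    }
    return sendmmsg(fd, msgs, n, 0);  // number of messages sent, or -1
}

One do_syscall_64 dispatch then covers the whole batch instead of one per packet, which should shrink that 15% directly.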
Upvotes: 5