Reputation: 3235
I have read several Linux books and tutorials about signals, and they all say the kernel handles pending signals when it transitions from kernel mode back to user mode. This made total sense, until I saw and experimented with the following code:
>cat sig_timing.c
#include <signal.h>
#include <unistd.h>
#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <stdbool.h>
volatile bool done = false;
static void sig_handler(int signo)
{
    printf("Received signal %d (%s), current errno:%s\n", signo, strsignal(signo), strerror(errno));
    done = true;
}

int main(int argc, char **argv)
{
    signal(SIGALRM, sig_handler);
    alarm(3);
    while (!done) {
        strlen("Hello World!");
    }
    return 0;
}
>gcc sig_timing.c
>./a.out
Received signal 14 (Alarm clock), current errno:Success
So main enters an endless loop after registering the signal handler. The loop does not invoke any system call, so there is no chance to enter the kernel; if there is no transition from kernel mode to user mode, there should be no chance to invoke the signal handler, right?
Later on, the talk's presenter explained what is going on (I have adapted it a bit):
The sending kernel thread sends an inter-CPU message to cause a hardware interrupt on the CPU running the target process, causing it to enter the kernel to handle the interrupt and then return to user mode.
I am not convinced: this explanation seems to say that the signal sender and the signal receiver run on two CPU hardware threads. But what about a CPU without hyper-threading? The process runs on just one CPU thread. In that case, does the signal get a chance to be handled while the userland code runs an endless loop?
Upvotes: 0
Views: 219
Reputation: 887
This will be an involved answer, so bear with me.
All modern OSs use pre-emptive scheduling (as contrasted with co-operative scheduling): a hardware timer is programmed to raise a timer interrupt at a regular interval (an OS "tick"), which then invokes the OS scheduler. Consequently, even if a user thread is running an infinite loop, every once in a while (maybe every 10ms) the hardware raises an interrupt, causing a trap into the kernel. This lets the OS run the scheduler and decide which user task runs next. In your specific example, it also lets the kernel not only keep scheduling your program (sig_timing), but notice when the userspace alarm should fire.
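You can actually observe these forced trips into the kernel from userspace. Here is a minimal sketch of mine (not from the question; it assumes a Linux/POSIX system where getrusage reports ru_nivcsw, the count of involuntary context switches) that spins without requesting any kernel entry and then asks how often it was switched out anyway:

/* Sketch: spin with no explicit system calls, then ask how often the
 * kernel switched this task out anyway. */
#include <stdio.h>
#include <sys/resource.h>
#include <time.h>

int main(void)
{
    struct timespec start, now;
    clock_gettime(CLOCK_MONOTONIC, &start);   /* on Linux this is usually a vDSO call, not a trap */

    do {
        clock_gettime(CLOCK_MONOTONIC, &now); /* busy loop, no kernel entry requested */
    } while (now.tv_sec - start.tv_sec < 3);

    struct rusage ru;
    getrusage(RUSAGE_SELF, &ru);
    /* ru_nivcsw: times the scheduler took the CPU away without being asked */
    printf("involuntary context switches: %ld\n", ru.ru_nivcsw);
    return 0;
}

On a machine with other runnable work, ru_nivcsw comes back nonzero, which is only possible because timer interrupts keep dragging the spinning task into the kernel where the scheduler can act.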
Pre-emptive scheduling is the main reason why you can get an interactive multi-programming experience even on a single-core computer. The CPU is time-shared between tasks, with the scheduler getting run every tick in order to schedule the different tasks.
Notice that the OS tick period (~10ms) is much smaller than the user alarm time (3s), so the OS can deliver the signal to the process at almost exactly 3s. A related consequence is that the OS can deliver SIGALRM much later than 3s if another high-priority userspace task is hogging the CPU and the OS cannot schedule your program.
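If you want to see how close "almost exactly 3s" is, here is a rough sketch (again my own illustration of the timing claim, not code from the question) that timestamps the alarm request and the moment the spinning loop observes the handler's flag:

/* Sketch: measure how long after alarm(3) the busy loop observes the signal. */
#include <signal.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t done = 0;

static void sig_handler(int signo)
{
    (void)signo;
    done = 1;                      /* just set a flag; print from main() */
}

int main(void)
{
    struct timespec start, end;

    signal(SIGALRM, sig_handler);
    clock_gettime(CLOCK_MONOTONIC, &start);
    alarm(3);

    while (!done) {
        /* busy loop, no system calls */
    }

    clock_gettime(CLOCK_MONOTONIC, &end);
    double elapsed = (end.tv_sec - start.tv_sec)
                   + (end.tv_nsec - start.tv_nsec) / 1e9;
    printf("handler observed after %.4f s\n", elapsed);
    return 0;
}

On an idle desktop this typically prints a value within a few milliseconds of 3s; under heavy load the overshoot grows, which matches the scheduling argument above.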
In co-operative scheduling, tasks are required to explicitly yield control to the scheduler (often using a yield-like call). On such a system, your program might never receive the signal.
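For comparison, here is a sketch (my own illustration, not code from the question) of the same wait written so that the task gives up the CPU instead of spinning; this is also how you would have to structure it on a co-operatively scheduled system:

/* Sketch: block in the kernel instead of spinning. pause() sleeps until a
 * signal arrives, so the kernel-to-user transition the books describe is
 * guaranteed to happen as soon as SIGALRM is delivered. */
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t done = 0;

static void sig_handler(int signo)
{
    (void)signo;
    done = 1;
}

int main(void)
{
    signal(SIGALRM, sig_handler);
    alarm(3);

    while (!done)
        pause();                   /* returns with EINTR after the handler runs */

    puts("got SIGALRM");
    return 0;
}

In real code you would block the signal and use sigsuspend() to avoid the classic race where the signal arrives between the !done check and pause(), but the shape of the solution is the same.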
Unlike some signals (SIGSEGV, SIGILL, SIGBUS), an alarm signal (SIGALRM) is not associated with a particular userspace instruction, and is not required to be "precise". In other words, the kernel does not need to be extremely precise about when it delivers the signal. Remember that a second is extremely long for a CPU core.
Precise exceptions, by contrast, are caused by a particular instruction: SIGSEGV by a memory-permission fault, SIGILL by an illegal instruction, and so on. The offending instruction causes the hardware to raise a trap immediately, and the OS starts running. The OS handles the fault and forwards the corresponding signal to the application, together with the precise machine state at that instruction.
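A sketch of what "precise" buys you (illustrative only; it uses the standard sigaction/SA_SIGINFO interface): the handler sees the exact faulting address before any later instruction of the program runs.

/* Sketch: a synchronous, precise signal. The faulting load traps on the spot
 * and the kernel passes the exact faulting address in siginfo_t.si_addr. */
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static void segv_handler(int signo, siginfo_t *info, void *ctx)
{
    (void)signo; (void)ctx;
    printf("SIGSEGV at address %p\n", info->si_addr);  /* printf is not async-signal-safe; tolerable only because we exit next */
    _exit(1);                      /* returning would re-run the faulting instruction */
}

int main(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = segv_handler;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);

    volatile int *p = NULL;
    int x = *p;                    /* this exact load raises the trap */
    (void)x;
    return 0;
}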
Upvotes: 1