William Breathitt Gray

Reputation: 11996

Inherent race condition in Linux IRQ handlers

Suppose there is a port-mapped I/O device which arbitrarily generates interrupts on an IRQ line. The device's pending interrupts may be cleared via a single outb call to a particular register.

Furthermore, suppose the following interrupt handler is assigned to the relevant IRQ line via request_irq:

irqreturn_t handler(int irq, void *data)
{
        /* clear pending IRQ on device */
        outb(0, CLEAR_IRQ_REGISTER_ADDR);

        /* device may generate another IRQ at this point,
         * but this handler function has not yet returned */

        /* signal kernel that IRQ has been handled */
        return IRQ_HANDLED;
}

Is there an inherent race condition in this IRQ handler? For example, if the device generates another interrupt after the "clear IRQ" outb call, but before the handler function returns IRQ_HANDLED, what will happen?

I can think of three scenarios:

  1. The IRQ line freezes and can no longer be handled, due to a deadlock between the device and the Linux kernel.
  2. The Linux kernel executes the handler again immediately after it returns, in order to handle the second interrupt.
  3. The Linux kernel interrupts the handler with a second call to the handler.

Upvotes: 4

Views: 823

Answers (2)

Alexandre Belloni

Reputation: 2314

Scenario 2 is the correct one. Interrupt handlers run with interrupts disabled on the local CPU, so a second interrupt cannot preempt your handler there. After your handler returns, the interrupt controller will see that another interrupt occurred, and your handler will be called again.

What may happen, though, is that you miss some interrupts if you are not fast enough and multiple interrupts arrive while you are still handling the first one. This should not happen in your case because you have to clear the pending interrupt.

Andy's answer is about another issue. You definitely have to lock access to your device and resources because your handler may run concurrently on different CPUs.
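To make the two paragraphs above concrete, here is a minimal userspace sketch, not kernel code. It assumes a toy model in which the interrupt controller keeps a single "pending" bit per IRQ line; the names irq_line, device_raise, dispatch and simulate are made up for illustration. It shows the handler being re-run after it returns (scenario 2), and several interrupts raised during one handler run collapsing into a single extra invocation.

```c
#include <stdbool.h>

/* Toy model (hypothetical, not kernel code): one IRQ line whose
 * controller remembers only "at least one interrupt is pending". */
struct irq_line {
    bool pending;      /* single-bit latch in the controller */
    unsigned raised;   /* interrupts generated by the device */
    unsigned handled;  /* number of handler invocations */
};

/* The device asserts the line; the latch cannot count beyond one. */
static void device_raise(struct irq_line *l)
{
    l->raised++;
    l->pending = true;
}

/* Run the handler until nothing is pending. `burst` is the number of
 * extra interrupts the device raises during the first handler run. */
static void dispatch(struct irq_line *l, unsigned burst)
{
    while (l->pending) {
        l->pending = false;  /* controller acknowledges the latched IRQ */
        l->handled++;        /* handler body runs, uninterrupted */

        /* The device fires again before the handler has returned. */
        for (; burst > 0; burst--)
            device_raise(l);

        /* Handler returns; the loop re-checks the latch (scenario 2). */
    }
}

/* One initial interrupt plus `burst` more during the first handler run.
 * Returns how many times the handler ended up being invoked. */
static unsigned simulate(unsigned burst)
{
    struct irq_line l = { false, 0, 0 };

    device_raise(&l);
    dispatch(&l, burst);
    return l.handled;
}
```

In this model, simulate(1) invokes the handler twice, and simulate(3) also invokes it only twice even though four interrupts were raised in total. That is the "missed interrupts" case. A device that keeps its interrupt asserted until the clear register is written, as in the question, behaves differently from this simple latch.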

Upvotes: 5

0andriy

Reputation: 4709

On SMP systems there is clearly a possibility of a race. Interrupts are local to the CPU, since most CPUs implement LAPIC controllers. Thus, you have to protect your data and device access with a critical-section synchronization mechanism. Because the code runs in interrupt context, the most suitable one here is spin_lock_irqsave().
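As a rough sketch of what that could look like, assuming a hypothetical per-device structure demo_dev passed as the dev_id argument of request_irq, and a made-up port address for CLEAR_IRQ_REGISTER_ADDR (the question does not give one). This is kernel code, so it has to be built as part of a module rather than run standalone:

```c
#include <linux/interrupt.h>
#include <linux/io.h>
#include <linux/spinlock.h>

/* Hypothetical port address, for illustration only. */
#define CLEAR_IRQ_REGISTER_ADDR 0x300

/* Hypothetical per-device state shared between the handler and
 * process-context code. */
struct demo_dev {
	spinlock_t lock;          /* set up with spin_lock_init() */
	unsigned long irq_count;  /* example of shared data */
};

static irqreturn_t handler(int irq, void *data)
{
	struct demo_dev *dev = data;  /* dev_id given to request_irq() */
	unsigned long flags;

	spin_lock_irqsave(&dev->lock, flags);
	outb(0, CLEAR_IRQ_REGISTER_ADDR);  /* clear pending IRQ on device */
	dev->irq_count++;
	spin_unlock_irqrestore(&dev->lock, flags);

	return IRQ_HANDLED;
}

/* Process-context reader. It disables local interrupts while holding
 * the lock, so the handler cannot spin on the same CPU waiting for it. */
static unsigned long demo_read_count(struct demo_dev *dev)
{
	unsigned long flags, count;

	spin_lock_irqsave(&dev->lock, flags);
	count = dev->irq_count;
	spin_unlock_irqrestore(&dev->lock, flags);

	return count;
}
```

The lock matters most for the second function: code on another CPU, or process-context code on the same CPU, may touch the device or irq_count while the handler is running.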

Upvotes: 1
