Jay

Reputation: 59

Is this a Linux Kernel Crash? How do I resolve it?

We are testing a firewall application running on embedded Linux. At a certain point during testing, the system hangs (freezes) and we see the following on the console:

TCHDOG: eth0 (fsl-gianfar): transmit queue 0 timed out
------------[ cut here ]------------ WARNING: at net/sched/sch_generic.c:279 Modules linked in: CPU: 0 PID: 0 Comm:
swapper/0 Not tainted 3.12.19-rt30-gc29fe1a #27 task: c08f9300 ti:
effea000 task.ti: c093a000 NIP: c052a98c LR: c052a98c CTR: c0327948
REGS: effebe60 TRAP: 0700   Not tainted  (3.12.19-rt30-gc29fe1a) MSR:
00029000 <CE,EE,ME>  CR: 44044022  XER: 20000000

GPR00: c052a98c effebf10 c08f9300 0000003f c128c484 c128c9d0 c0328b54
00021000  GPR08: 00000001 00000001 0099b000 00000312 24044024 0f003103
effea000 c07f6f28  GPR16: 00000100 00200000 c0940000 c08f0000 001631a5
00000000 000000a4 ffffffff  GPR24: 00000000 00000000 effea000 00000004
c0940000 c0940000 c74d0000 00000000  NIP [c052a98c]
dev_watchdog+0x2dc/0x2ec LR [c052a98c] dev_watchdog+0x2dc/0x2ec Call
Trace: [effebf10] [c052a98c] dev_watchdog+0x2dc/0x2ec (unreliable)
[effebf40] [c005194c] call_timer_fn.isra.29+0x28/0x84 [effebf60]
[c0051b28] run_timer_softirq+0x180/0x1fc [effebfa0] [c004a5e8]
__do_softirq+0x100/0x1cc [effebff0] [c000d6e8] call_do_softirq+0x24/0x3c [c093be60] [c0004920] do_softirq+0x90/0xb8
[c093be80] [c004afb4] irq_exit+0xa4/0xc8 [c093be90] [c0009c10]
timer_interrupt+0x1a4/0x1d0 [c093bec0] [c000f594]
ret_from_except+0x0/0x18
--- Exception: 901 at arch_cpu_idle+0x24/0x5c
    LR = arch_cpu_idle+0x24/0x5c [c093bf80] [c00ac4ec] rcu_idle_enter+0xac/0xec (unreliable) [c093bf90] [c0086b00]
cpu_startup_entry+0x120/0x170 [c093bfc0] [c08a97a8]
start_kernel+0x2f0/0x304 [c093bff0] [c00003fc] skpinv+0x2e8/0x324
Instruction dump: 4e800421 80fe0204 4bffff44 7fc3f378 4bfe72e5
7fc4f378 7c651b78 3c60c085  7fe6fb78 38632bf0 4cc63182 48184835
<0fe00000> 39200001 993c9c37 4bffffb4 
---[ end trace d3f58d6e7db83823 ]---

Is it a kernel crash? What caused it? How do I resolve it? Please let me know if you need any other information.

Upvotes: 1

Views: 2890

Answers (2)

gby

Reputation: 15218

No, it isn't a kernel crash.

It's a warning notification from an internal watchdog timer that watches over the transmit work of the Freescale Gianfar Ethernet driver.

The message means the driver queued one or more frames for transmission and then timed out waiting for a transmit confirmation interrupt (or other indication) from the Gianfar hardware that they were transmitted.

This may be a driver issue - but it can very well be a hardware issue (e.g. Ethernet MAC getting stuck).
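
To make the watchdog check described above concrete, here is a minimal user-space sketch in Python. It is not the kernel's code: `TxQueue`, `timed_out_queues`, and the timing values are hypothetical names and numbers chosen for illustration, and the model assumes the check boils down to "the queue is stopped and nothing has been transmitted for longer than the timeout".

```python
from dataclasses import dataclass


@dataclass
class TxQueue:
    """Hypothetical stand-in for one transmit queue of a network device."""
    stopped: bool         # the driver stopped the queue (e.g. hardware ring is full)
    last_tx_start: float  # time in seconds when the last frame was handed to hardware


def timed_out_queues(queues, now, timeout):
    """Return the indices of queues that are stopped and stuck longer than `timeout`."""
    return [
        i for i, q in enumerate(queues)
        if q.stopped and now - q.last_tx_start > timeout
    ]


queues = [
    TxQueue(stopped=False, last_tx_start=0.0),   # idle but not stopped: fine
    TxQueue(stopped=True, last_tx_start=10.0),   # stopped for 6 s: timed out
    TxQueue(stopped=True, last_tx_start=14.0),   # stopped for 2 s: still within limit
]
print(timed_out_queues(queues, now=16.0, timeout=5.0))  # prints [1]
```

A queue that is merely idle never trips the check; only a queue that is waiting on the hardware and gets no completion in time does, which is why the message points at the driver or the MAC rather than at the rest of the system.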

BTW, the content of the message says your system was not doing anything (it was idle) at the time the watchdog timer fired: the task is swapper/0 with PID 0, and the interrupted code is arch_cpu_idle.
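
As a rough way to read those facts out of a console dump like the one in the question, the following Python sketch extracts a few fields with regular expressions. The function name `summarize_trace` and the returned keys are assumptions made for this example, and the patterns only cover the fields visible in this particular log.

```python
import re


def summarize_trace(log):
    """Pull a few fields out of a kernel console dump (illustrative, not exhaustive)."""
    pid = re.search(r"PID:\s*(\d+)", log)
    comm = re.search(r"Comm:\s*(\S+)", log)  # \s* also skips a line break after "Comm:"
    return {
        "is_warning": "WARNING: at" in log,   # a WARN, as opposed to an Oops or panic
        "pid": int(pid.group(1)) if pid else None,
        "comm": comm.group(1) if comm else None,
        "in_idle": "arch_cpu_idle" in log,    # the interrupted code was the idle loop
    }


sample = (
    "WARNING: at net/sched/sch_generic.c:279 Modules linked in: CPU: 0 PID: 0 Comm:\n"
    "swapper/0 Not tainted 3.12.19-rt30-gc29fe1a #27\n"
    "--- Exception: 901 at arch_cpu_idle+0x24/0x5c\n"
)
print(summarize_trace(sample))
```

For the log in the question this reports a warning raised while PID 0 (swapper/0) was in the idle loop, which matches the reading above.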

Upvotes: 3

Sebb

Reputation: 911

We/I can't know exactly what you're doing without digging into your code. However, here's an attempt to analyze it a little ;)

The line WARNING: at net/sched/sch_generic.c:279 Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.12.19-rt30-gc29fe1a shows, alongside some kernel data ("Not tainted" means, among other things, that you didn't load closed-source drivers), that the crash occurred here. The stack trace verifies that this is the cause, too. While this line isn't too helpful per se (for me, I'm not into the kernel source), it shows that the net scheduler failed. If your firewall somehow messed with it, you should start your search there.
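
If you want to check the taint state on a running system rather than from the trace, the kernel exposes it as a bitmask in /proc/sys/kernel/tainted. The Python sketch below decodes a few of the bits; `TAINT_FLAGS` and `decode_taint` are names invented for this example, and the table is deliberately partial, so treat it as a starting point and check the kernel documentation for your version.

```python
# Partial table of taint bits (bit number -> letter and meaning).
TAINT_FLAGS = {
    0: "P: proprietary module loaded",
    1: "F: module force-loaded",
    7: "D: kernel has oopsed before",
    9: "W: kernel has issued a warning",
    12: "O: out-of-tree module loaded",
}


def decode_taint(value):
    """Return descriptions of the known taint bits set in `value`."""
    return [desc for bit, desc in sorted(TAINT_FLAGS.items()) if value & (1 << bit)]


# On a live system the value could be read with:
#   value = int(open("/proc/sys/kernel/tainted").read())
print(decode_taint(0))    # prints []
print(decode_taint(512))  # prints ['W: kernel has issued a warning']
```

Note that a warning like the one in the question itself sets the W bit, so a value of 512 after the event would not by itself mean that anything closed-source was loaded.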

If not, you may have encountered an actual kernel bug. The first thing to do is update your kernel version, if possible; 3.19 and 4.1 are available as of writing. If this doesn't help (or you really need this version), you can file a kernel bug. Since your kernel isn't tainted, you can expect help from the devs. Good luck :)

Upvotes: 1
