Sriram

Reputation: 150

Signal handler (segv) unable to complete before device crashes

I have installed a handler (say, crashHandler()) that does a bit of file output. It runs in a Linux thread that registers crashHandler() for SIGSEGV. The file writing is required, as the handler stores the stack trace to persistent storage.
It works most of the time. But in one specific scenario, crashHandler() executes only partly (I can see its logs) and then the device reboots. Can someone help me with a way to deal with this?

Upvotes: 1

Views: 347

Answers (1)

bdonlan

Reputation: 231143

The first question to ask here is why the device rebooted. Normally, an ordinary application crashing won't cause a kernel-level or hardware-level reboot. Most likely, one of two things is happening. Either you're hitting a watchdog timer before the crash handler completes, in which case you should extend the watchdog timeout (do NOT reset the timer from within the crash handler, though, as a problem in the crash handler itself could then prevent a reboot). Or this process is pid 1 and it's crashing again inside the SIGSEGV handler, causing a kernel panic because pid 1 (init) died.

If it's the latter, you need to be more careful with what you do in that crash handler. Remember, you just crashed. You know memory is corrupt, but you don't know how it's corrupt. It may be corrupt in ways that affect the crash handler itself - e.g. if the heap metadata is corrupt, you may be unable to allocate memory without crashing for real this time. You should keep what you do in that handler to a bare minimum - in particular, avoid calling any library functions that are not documented as being async-signal-safe, and avoid using any complex (pointer-containing) data structures or dynamically allocated memory. For the highest level of safety, limit yourself to calling fork() and exec() to start another process that will use debugger APIs (ptrace() and /proc/$PID/mem) to perform memory dumps or whatever else you might need.
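
To make that advice concrete, here is a minimal sketch of such a handler, not a drop-in implementation. The names install_crash_handler(), crash_fd and the helper path are hypothetical and only for illustration. It assumes the log file can be opened before any crash happens, and that fork() is safe to call from a signal handler on your libc (older POSIX editions list it as async-signal-safe, newer ones do not). The handler itself only uses write(), fsync(), fork(), execv(), waitpid(), signal() and raise(), and it runs on an alternate stack so that a stack overflow does not break it.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Everything the handler touches is prepared before any crash happens. */
static int crash_fd = -1;          /* log file, opened in advance */
static char pid_arg[16];           /* our pid, pre-formatted as text */
static char *helper_argv[3];       /* { helper path, pid string, NULL } */
static char alt_stack[1 << 16];    /* separate stack for the handler */

static void crash_handler(int sig)
{
    static const char msg[] = "crashHandler: fatal signal caught\n";

    /* write() and fsync() are async-signal-safe; printf() and malloc() are not. */
    if (write(crash_fd, msg, sizeof msg - 1) < 0) {
        /* nothing safe to do about a failed write here */
    }
    fsync(crash_fd);

    /* Optionally hand the real work to a separate, healthy process. */
    if (helper_argv[0] != NULL) {
        pid_t child = fork();
        if (child == 0) {
            execv(helper_argv[0], helper_argv);
            _exit(127);            /* exec failed */
        }
        if (child > 0)
            waitpid(child, NULL, 0);
    }

    /* Restore the default action and re-raise so the process still dies
       from the original signal once the handler returns. */
    signal(sig, SIG_DFL);
    raise(sig);
}

/* helper_path may be NULL to skip the fork()/exec() step. */
int install_crash_handler(const char *log_path, const char *helper_path)
{
    stack_t ss;
    struct sigaction sa;

    crash_fd = open(log_path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (crash_fd < 0)
        return -1;

    /* snprintf() is fine here because we are not in a signal handler yet. */
    snprintf(pid_arg, sizeof pid_arg, "%ld", (long)getpid());
    helper_argv[0] = (char *)helper_path;
    helper_argv[1] = pid_arg;
    helper_argv[2] = NULL;

    memset(&ss, 0, sizeof ss);
    ss.ss_sp = alt_stack;
    ss.ss_size = sizeof alt_stack;
    if (sigaltstack(&ss, NULL) != 0)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = crash_handler;
    sa.sa_flags = SA_ONSTACK;      /* run the handler on alt_stack */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

You would call install_crash_handler() once at startup, before anything can crash. The helper program (if any) receives the crashing pid as its first argument and can then attach with ptrace() or read /proc/$PID/mem; on systems that restrict ptrace() you may need extra setup for that to be permitted. Note that this does not change the watchdog advice above: if the handler is slow, the watchdog timeout still has to be long enough for it to finish.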

Upvotes: 1
