Jeffrey Blattman

Reputation: 22647

"RT throttling activated" + kernel panic when flooding UDP packets

I have an Android application that uses UDP multicast to exchange messages between devices. In my stress test, multiple devices send datagrams to each other and receive responses from each other, all over UDP. Each device sends a datagram at most every ~5 ms. With 5 devices sending, a device can receive a packet as often as every ~1 ms.
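As a rough check of that arithmetic, the expected gap between incoming datagrams can be sketched as below. This assumes evenly spaced senders; `interarrival_us` is a hypothetical helper name, and whether a device also receives its own multicast packets depends on the loopback setting, so both cases are shown.

```shell
# Sketch: expected gap between received datagrams, in microseconds.
# Assumes evenly spaced senders; interarrival_us is a hypothetical helper.
#   $1 = per-device send interval in ms
#   $2 = number of devices whose packets this device receives
interarrival_us() {
    echo $(( $1 * 1000 / $2 ))
}

interarrival_us 5 4   # 4 peers, own packets not counted: prints 1250
interarrival_us 5 5   # 5 senders, own packets looped back: prints 1000
```

Either way the receive rate is on the order of 800 to 1000 datagrams per second per device.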

The problem is that this causes my devices to reboot, usually within about 10 seconds of starting the tests. Typically 1-3 of the 5 devices under test reboot, with the following in the kernel log:

[  131.986546] sched: RT throttling activated for rt_rq ffffffc0ac098e50 (cpu 1)
[  131.986546] potential CPU hogs:
[  131.986546]  msm_thermal:fre (307)
[  132.000113] ------------[ cut here ]------------
[  132.004692] kernel BUG at XXX/kernel/kernel/sched/rt.c:866!
[  132.013583] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[  132.019101] Modules linked in: XXX core_ctl(PO) qdrbg_module(O) qcrypto_module(O)
[  132.026953] CPU: 1 PID: 307 Comm: msm_thermal:fre Tainted: P        W  O 3.10.84-g2e3fe32-00061-gf40a46d #6

Looking at rt.c in the kernel source for one device, it crashes on BUG() below:

#ifdef CONFIG_PANIC_ON_RT_THROTTLING
    /*
     * Use pr_err() in the BUG() case since printk_sched() will
     * not get flushed and deadlock is not a concern.
     */
    pr_err("%s", buf);
    BUG();
#else
    printk_deferred("%s", buf);
#endif

We do have CONFIG_PANIC_ON_RT_THROTTLING enabled in our kernel. I tried simply disabling it (FWIW, Google's kernel configs for its devices have it disabled), and the system still crashes, but at some other point, with absolutely nothing in the kernel log; it just cuts off abruptly.

Amazingly, this happens across Android versions from 5.1 to 8.1, obviously with different Linux kernel versions. All of the devices use Qualcomm CPUs (CONFIG_PANIC_ON_RT_THROTTLING was added by Qualcomm).

Applying some common sense, UDP packet delivery must be handled by a task running under the RT scheduler, and the fact that the system needs to throttle it implies that I'm loading the CPU beyond its capacity. It is hard to believe that is possible on a modern CPU just by sending UDP packets, and it does not explain why that would panic the kernel.

Upvotes: 4

Views: 3443

Answers (1)

Monah Tuk

Reputation: 239

I see that this question was asked a long time ago, but I have the same issue on the 4.9.241 kernel on a Khadas VIM3, with an application that makes heavy use of the serial port.

The following workaround helped me:

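# Mitigate the RT throttling issue (which causes a kernel panic on Khadas).
# Per the kernel documentation, a runtime of -1 disables RT throttling.
echo -1 >/proc/sys/kernel/sched_rt_runtime_us
# Note: this second write may be rejected with "Invalid argument" on
# kernels that require a positive period; the runtime write above is
# the one that matters.
echo -1 >/proc/sys/kernel/sched_rt_period_us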

Ref: https://www.kernel.org/doc/Documentation/scheduler/sched-rt-group.txt
Ref: https://doc.opensuse.org/documentation/leap/archive/42.1/tuning/html/book.sle.tuning/cha.tuning.taskscheduler.html
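To see what these two sysctls mean in practice, here is a minimal sketch that turns them into the share of each period that RT tasks may consume before being throttled. `rt_share_percent` is a hypothetical helper name; the 950000/1000000 values are the mainline defaults described in the kernel documentation linked above, and a vendor kernel may ship different ones.

```shell
# Sketch: share of each period that RT tasks may use before throttling.
# rt_share_percent is a hypothetical helper; its arguments mirror
# sched_rt_runtime_us ($1) and sched_rt_period_us ($2).
rt_share_percent() {
    if [ "$1" -lt 0 ]; then
        echo "unlimited"          # a runtime of -1 disables RT throttling
    else
        echo $(( $1 * 100 / $2 ))
    fi
}

rt_share_percent 950000 1000000   # mainline defaults: prints 95
rt_share_percent -1 1000000       # after the workaround: prints unlimited

# On a live system (assumes both sysctl files exist and are readable):
# rt_share_percent "$(cat /proc/sys/kernel/sched_rt_runtime_us)" \
#                  "$(cat /proc/sys/kernel/sched_rt_period_us)"
```

Keep in mind that with throttling disabled, a runaway RT task can starve everything else on that CPU, so this is a mitigation rather than a fix for whatever is hogging the CPU.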

Upvotes: 1
