Reputation: 22647
I have an Android application that uses UDP broadcast (multicast, actually) to exchange messages between devices. In my stress test, multiple devices send to each other and receive responses from each other, all over UDP. Each device sends a datagram packet at most every ~5 ms. With 5 devices sending, a device can be receiving packets as often as every ~1 ms.
The problem is that this causes my devices to reboot, usually within about 10 seconds of starting the test. Typically 1-3 of the 5 devices under test will reboot. The kernel log leading up to a crash:
[ 131.986546] sched: RT throttling activated for rt_rq ffffffc0ac098e50 (cpu 1)
[ 131.986546] potential CPU hogs:
[ 131.986546] msm_thermal:fre (307)
[ 132.000113] ------------[ cut here ]------------
[ 132.004692] kernel BUG at XXX/kernel/kernel/sched/rt.c:866!
[ 132.013583] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[ 132.019101] Modules linked in: XXX core_ctl(PO) qdrbg_module(O) qcrypto_module(O)
[ 132.026953] CPU: 1 PID: 307 Comm: msm_thermal:fre Tainted: P W O 3.10.84-g2e3fe32-00061-gf40a46d #6
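To find which task the kernel blames for the throttling, the lines following "RT throttling activated" are the ones to look at (here they name msm_thermal:fre, PID 307). A minimal sketch for pulling those lines out of a log, assuming a grep that supports -A; the function name rt_throttle_events is made up for illustration:

```shell
#!/bin/sh
# Print each RT-throttling event plus the two lines after it
# (the "potential CPU hogs" header and the first task listed).
# Reads kernel log text on stdin, so it works on a saved log or piped dmesg.
rt_throttle_events() {
    grep -A 2 'RT throttling activated'
}

# Example using the excerpt above.
rt_throttle_events <<'EOF'
[ 131.986546] sched: RT throttling activated for rt_rq ffffffc0ac098e50 (cpu 1)
[ 131.986546] potential CPU hogs:
[ 131.986546] msm_thermal:fre (307)
[ 132.000113] ------------[ cut here ]------------
EOF
```

On a device this could be fed from something like `adb shell dmesg | rt_throttle_events`, assuming the shell is allowed to read the kernel log (newer Android versions may require root for that).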
Looking at rt.c in the kernel source for one of the devices, the crash happens at the BUG() call below:
#ifdef CONFIG_PANIC_ON_RT_THROTTLING
	/*
	 * Use pr_err() in the BUG() case since printk_sched() will
	 * not get flushed and deadlock is not a concern.
	 */
	pr_err("%s", buf);
	BUG();
#else
	printk_deferred("%s", buf);
#endif
We do have CONFIG_PANIC_ON_RT_THROTTLING enabled in our kernel. I tried simply disabling it (FWIW, Google's kernel configs for its devices have it disabled), and the system still crashes, but at some other point: the kernel log contains nothing relevant and just cuts off abruptly.
Amazingly, this happens across different versions of Android from 5.1 to 8.1, obviously with different Linux kernel versions. All devices are running Qualcomm CPUs (CONFIG_PANIC_ON_RT_THROTTLING was added by Qualcomm).
Just analyzing this with some common sense: UDP packet delivery must be handled by something running under the RT scheduler, and the fact that the system needs to throttle it implies that I'm loading the CPU beyond its capacity. I find it hard to believe that sending UDP packets can do that on a modern CPU (and, of course, it doesn't explain why that would panic the kernel).
Upvotes: 4
Views: 3443
Reputation: 239
I see this question was asked a long time ago, but I hit the same issue on the 4.9.241 kernel on a Khadas VIM3, with an application that makes heavy use of the serial port.
The following workaround helped me:
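# Mitigate the RT throttling issue (which causes a kernel panic on Khadas)
# Writing -1 to sched_rt_runtime_us is what disables RT throttling.
echo -1 >/proc/sys/kernel/sched_rt_runtime_us
# Note: the period is expected to be a positive value, so the kernel may
# reject this second write with "Invalid argument"; the first write should suffice.
echo -1 >/proc/sys/kernel/sched_rt_period_us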
Ref: https://www.kernel.org/doc/Documentation/scheduler/sched-rt-group.txt
Ref: https://doc.opensuse.org/documentation/leap/archive/42.1/tuning/html/book.sle.tuning/cha.tuning.taskscheduler.html
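Before changing anything, it can help to check what budget is currently in effect. The sketch below is only an illustration: the function name rt_budget is made up, and it assumes a POSIX-style shell with read access to the two /proc files described in the references above. It reports what share of each period real-time tasks may consume, or that throttling is disabled when the runtime is -1.

```shell
#!/bin/sh
# Describe the RT budget given runtime and period, both in microseconds.
# A runtime of -1 means RT throttling is disabled.
rt_budget() {
    runtime=$1
    period=$2
    if [ "$runtime" -eq -1 ]; then
        echo "RT throttling disabled"
    else
        echo "RT tasks may use $((runtime * 100 / period))% of each ${period}us period"
    fi
}

# Report the live values if the sysctl files are readable on this system.
if [ -r /proc/sys/kernel/sched_rt_runtime_us ] && [ -r /proc/sys/kernel/sched_rt_period_us ]; then
    rt_budget "$(cat /proc/sys/kernel/sched_rt_runtime_us)" \
              "$(cat /proc/sys/kernel/sched_rt_period_us)"
fi
```

With a runtime of 950000 and a period of 1000000, for instance, this reports a 95% budget. Note that disabling the limit removes the safety margin that keeps a runaway real-time task from starving everything else, so it is best treated as a workaround rather than a fix.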
Upvotes: 1