Reputation: 111
I am using Cheyenne for a relatively high-load web application. It works well and is fast. But I have a problem that started appearing after upgrading to Ubuntu 14.04, or perhaps I only started noticing it then because the load increased.
After a few days of running, when a Rebol worker process should exit, it instead starts consuming 100% CPU and appears to do nothing. I inspected the process with strace, and while it is at 100% CPU it makes no system calls at all. I checked the Cheyenne worker code for a fault, and it executes correctly up to the Rebol command exit. That command is what makes it loop forever. The same happens if I try to kill the process with SIGTERM.
I can then kill it with SIGKILL. The process only gets into this state after a few days of heavy load, and I haven't been able to reproduce it in a non-production environment or on a local computer.
My naive guess is that it loops forever while trying to free its memory before exiting, or perhaps while closing its open files or sockets. I compared the processes before and after with lsof (and similar tools), but since the event isn't easily reproducible I haven't figured anything out yet.
My question is: has anyone seen Rebol2 go into an endless 100% CPU loop on exit, and under which circumstances? Does anyone have any idea how to solve this?
Upvotes: 2
Views: 86
Reputation: 111
I've seen this on our production Cheyenne servers: a process stuck at 100% CPU and not responding, probably after serving a very long file (a lot of data in the response). I never found the time to diagnose the issue further, and ended up writing a monitor in Go that kills any process that stays at 100% CPU for too long.
https://github.com/Softinnov/bearded-monitor
You can use it in a Docker container:
https://hub.docker.com/r/softinnov/bearded-monitor/
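To illustrate the general idea of such a watchdog, here is a minimal sketch in Go. It is not the code of bearded-monitor; the `Tracker` type, `NewTracker`, `Observe`, the 95% threshold, and the three-sample limit are all hypothetical names and values chosen for the example. It only shows the decision logic: given periodic CPU samples per PID, report the PIDs that have stayed above the threshold for several consecutive samples. How the samples are collected (for example from /proc or ps) is assumed and left out.

```go
package main

import (
	"fmt"
	"sort"
)

// Tracker counts, per PID, how many consecutive samples were at or above
// a CPU threshold. All names here are hypothetical, for illustration only.
type Tracker struct {
	threshold float64     // CPU percentage treated as "pegged"
	limit     int         // consecutive hot samples before a PID is reported
	hot       map[int]int // PID -> current run of hot samples
}

// NewTracker builds a Tracker with the given threshold and sample limit.
func NewTracker(threshold float64, limit int) *Tracker {
	return &Tracker{threshold: threshold, limit: limit, hot: make(map[int]int)}
}

// Observe takes one sample (PID -> CPU percent) and returns, in ascending
// order, the PIDs whose run of hot samples has reached the limit.
func (t *Tracker) Observe(sample map[int]float64) []int {
	// Forget PIDs that exited or dropped below the threshold.
	for pid := range t.hot {
		if cpu, ok := sample[pid]; !ok || cpu < t.threshold {
			delete(t.hot, pid)
		}
	}
	var stuck []int
	for pid, cpu := range sample {
		if cpu < t.threshold {
			continue
		}
		t.hot[pid]++
		if t.hot[pid] >= t.limit {
			stuck = append(stuck, pid)
		}
	}
	sort.Ints(stuck)
	return stuck
}

func main() {
	// Made-up samples: PID 101 stays pegged, PID 102 only spikes once.
	tr := NewTracker(95.0, 3)
	samples := []map[int]float64{
		{101: 99.0, 102: 12.0},
		{101: 100.0, 102: 97.0},
		{101: 99.5, 102: 3.0},
	}
	for i, s := range samples {
		fmt.Printf("sample %d: stuck=%v\n", i+1, tr.Observe(s))
	}
}
```

With these made-up samples, only PID 101 is reported, and only on the third sample. In a real monitor each reported PID would then be killed with SIGKILL (on Linux, `syscall.Kill(pid, syscall.SIGKILL)`), since, as described in the question, SIGTERM has no effect on a worker in this state.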
Hope it helps.
Upvotes: 2