Feru

Reputation: 1221

How to diagnose a Python process chewing CPU on Linux

At a certain point in my automated scripts, my Python process starts chewing CPU on a Linux-based system (Ubuntu). I'm trying to debug this in GDB, but I'm fairly new to it. Is there a GDB command that shows which thread is using the most CPU? Looking at the thread stacks doesn't really give that away.

In the Windows WinDbg world, the '!runaway' command reports the time consumed by each thread in a process. Is there an equivalent command here? Any other suggestions for debugging this issue?

Upvotes: 8

Views: 9642

Answers (3)

Feru

Reputation: 1221

Just to clarify all the steps required to diagnose this issue (thanks, everyone, for the postings):

The following command lists processes with their CPU and memory usage:

$ ps auxf 

The following command lists all threads of a process, sorted by CPU usage:

$ top -H -p [PID]

PID      USER   PR  NI  VIRT  RES  SHR S  %CPU  %MEM    TIME+  COMMAND
1654     root   20   0 1416m 1.2g  24m t   100  36.8  21:26.23 python
1687     root   20   0 1416m 1.2g  24m t     0  36.8   0:05.07 python

Thread 1654 is chewing CPU. Attach gdb to the process:

$ gdb /path/of/executable [pid]

In gdb, the following command lists the threads:

(gdb) info threads

2  Thread 0xa7bffb40 (LWP 20736) "python" 0xb7736424 in __kernel_vsyscall ()
1  Thread 0xb73a56c0 (LWP 1654)  "python" 0xb7736424 in __kernel_vsyscall ()

In gdb, switch to the thread whose LWP matches the busy thread ID from top (here thread 1, LWP 1654) and check its stack:

(gdb) thread 1
(gdb) bt
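
For a CPython process, bt shows the C frames of the interpreter, which may not reveal which Python code is spinning. If the script can be modified, one option is to have it print the Python-level stack of every thread when it receives a signal. This is a rough sketch rather than part of the steps above; the helper names format_all_stacks and install_stack_dump, and the choice of SIGUSR1, are assumptions made for illustration.

```python
import signal
import sys
import threading
import traceback


def format_all_stacks():
    """Return the current Python stack of every thread as one string."""
    # Map thread identifiers to readable names where the threading module knows them.
    names = {t.ident: t.name for t in threading.enumerate()}
    chunks = []
    for ident, frame in sys._current_frames().items():
        chunks.append(f"Thread {names.get(ident, '?')} (ident {ident}):\n")
        chunks.extend(traceback.format_stack(frame))
    return "".join(chunks)


def install_stack_dump(signum):
    """Write all thread stacks to stderr whenever the process receives signum."""
    signal.signal(signum, lambda sig, frame: sys.stderr.write(format_all_stacks()))
```

Calling install_stack_dump(signal.SIGUSR1) early in the script and later running kill -USR1 [pid] from a shell should print the stacks. Python-level signal handlers only run in the main thread between bytecode instructions, so the dump may be delayed if the main thread is stuck inside a C call; the standard library's faulthandler.register(signal.SIGUSR1, all_threads=True) is a built-in alternative that does not have this limitation.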

Upvotes: 18

hackerb9

Reputation: 1912

Short answer

  1. $ top -H

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    1654 root 20 0 1416m 1.2g 24m R 100 36.8 21:26.23 python
    1687 root 20 0 1416m 1.2g 24m S 0 36.8 0:05.07 python
  2. q

  3. $ gdb -p 1654

  4. (gdb) bt

Long answer

top -H

Run top -H to show the threads consuming the most CPU on your machine. (Without -H, top normally shows a single PID for each group of threads.) If you have a runaway thread, you'll see it at the top of the list. Note its PID.¹

PID      USER   PR  NI  VIRT  RES  SHR S  %CPU  %MEM    TIME+  COMMAND
1654     root   20   0 1416m 1.2g  24m R   100  36.8  21:26.23 python
1687     root   20   0 1416m 1.2g  24m S     0  36.8   0:05.07 python
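
The same per-thread numbers can also be read from /proc by a script, which is roughly what the question's '!runaway' reports: accumulated CPU time per thread. The following is a minimal sketch, assuming a Linux /proc filesystem; the function names parse_cpu_ticks and thread_cpu_seconds are made up for this example.

```python
import os


def parse_cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from a /proc/<pid>/task/<tid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain spaces
    # or parentheses, so only look at what follows the last ')'.
    fields = stat_line[stat_line.rindex(")") + 1:].split()
    # fields[0] is the state (field 3 in proc(5)); utime and stime are fields 14 and 15.
    return int(fields[11]) + int(fields[12])


def thread_cpu_seconds(pid):
    """Return [(tid, cpu_seconds)] for every thread of pid, busiest first."""
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    usage = []
    for tid in os.listdir(f"/proc/{pid}/task"):
        try:
            with open(f"/proc/{pid}/task/{tid}/stat") as f:
                ticks = parse_cpu_ticks(f.read())
        except OSError:
            continue  # the thread exited while we were reading
        usage.append((int(tid), ticks / ticks_per_second))
    return sorted(usage, key=lambda item: item[1], reverse=True)
```

For example, thread_cpu_seconds(1654) would list each thread ID of process 1654 with its total CPU time. Unlike the %CPU column in top, this is cumulative time since the thread started, so a thread that has only recently started spinning may not be first in the list yet.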

Back to the prompt

Hit q to quit top. (Alternatively, you may open a second terminal window.)

gdb -p [pid]

Run gdb using the -p option to tell it to attach to the PID of the thread you want to diagnose. You will get a (gdb) prompt.

$ gdb -p 1654
(gdb) 

bt

At the (gdb) prompt, type bt to see the backtrace. The most useful gdb commands for examining the call stack and variables are: up, down, list, and p. (I also highly recommend set print pretty on so the output from p is easier to read.)

(gdb) bt
#0  0x000000000057279a in std::_Hashtable<terminal::renderer::ImageFragmentKey, std::pair<terminal::renderer::ImageFragmentKey const, terminal::renderer::ImageRenderer::Metadata>, std::allocator<std::pair<terminal::renderer::ImageFragmentKey const, terminal::renderer::ImageRenderer::Metadata> >, std::__detail::_Select1st, std::equal_to<terminal::renderer::ImageFragmentKey>, std::hash<terminal::renderer::ImageFragmentKey>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_find_before_node(unsigned long, terminal::renderer::ImageFragmentKey const&, unsigned long) const
    (this=this@entry=0x1c11780, __bkt=__bkt@entry=69247, __k=..., __code=11352315644114232719) at /usr/include/c++/10/bits/hashtable.h:1577
#1  0x0000000000573638 in std::_Hashtable<terminal::renderer::ImageFragmentKey, std::pair<terminal::renderer::ImageFragmentKey const, terminal::renderer::ImageRenderer::Metadata>, std::allocator<std::pair<terminal::renderer::ImageFragmentKey const, terminal::renderer::ImageRenderer::Metadata> >, std::__detail::_Select1st, std::equal_to<terminal::renderer::ImageFragmentKey>, std::hash<terminal::renderer::ImageFragmentKey>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_find_node(unsigned long, terminal::renderer::ImageFragmentKey const&, unsigned long) const
    (__c=11352315644114232719, __key=..., __bkt=69247, this=0x1c11780) at /usr/include/c++/10/bits/hashtable.h:693
#2  std::_Hashtable<terminal::renderer::ImageFragmentKey, std::pair<terminal::renderer::ImageFragmentKey const, terminal::renderer::ImageRenderer::Metadata>, std::allocator<std::pair<terminal::renderer::ImageFragmentKey const, terminal::renderer::ImageRenderer::Metadata> >, std::__detail::_Select1st, std::equal_to<terminal::renderer::ImageFragmentKey>, std::hash<terminal::renderer::ImageFragmentKey>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::find(terminal::renderer::ImageFragmentKey const&) (__k=..., this=0x1c11780) at /usr/include/c++/10/bits/hashtable.h:1454
#3  std::unordered_map<terminal::renderer::ImageFragmentKey, terminal::renderer::ImageRenderer::Metadata, std::hash<terminal::renderer::ImageFragmentKey>, std::equal_to<terminal::renderer::ImageFragmentKey>, std::allocator<std::pair<terminal::renderer::ImageFragmentKey const, terminal::renderer::ImageRenderer::Metadata> > >::find(terminal::renderer::ImageFragmentKey const&) (__x=..., this=0x1c11780) at /usr/include/c++/10/bits/unordered_map.h:920
...

Minor caveat

If you attach to an individual thread using gdb -p as I suggest here, gdb will act as if there is only one thread. You must attach to the main process's PID if you wish to use the thread command to switch which thread you are debugging.
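
If all you have is a thread ID taken from top -H, the PID of the owning process can be looked up in /proc. A small sketch, assuming Linux; parse_tgid and owning_pid are hypothetical names used only for this example.

```python
def parse_tgid(status_text):
    """Extract the Tgid value from the contents of a /proc/<id>/status file."""
    for line in status_text.splitlines():
        if line.startswith("Tgid:"):
            return int(line.split()[1])
    raise ValueError("no Tgid line found")


def owning_pid(tid):
    """Return the PID (thread group ID) of the process that thread tid belongs to."""
    # /proc/<tid>/status can be opened for any thread ID, not only for main PIDs.
    with open(f"/proc/{tid}/status") as f:
        return parse_tgid(f.read())
```

Passing the result of owning_pid to gdb -p attaches to the whole process, after which info threads and thread N work as usual.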


¹ Why do I say "PID" (process ID) instead of "TID" (thread ID)? Because, in Linux, threads are simply lightweight processes and use the same internal structure in the kernel as processes. Every new thread gets its own PID, different from that of the thread that created it. To group them logically as a single "process", each thread has another field called TGID (Thread Group ID), which holds the PID of the main thread of the process.
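
This can be observed from inside Python as well. A minimal sketch, assuming Python 3.8 or later for threading.get_native_id; the helper name native_ids is made up for this example.

```python
import os
import threading


def native_ids():
    """Return the kernel-level IDs of the calling thread and of a freshly started worker."""
    ids = {"caller": threading.get_native_id()}

    def record():
        # Runs in the worker thread, so this is the worker's own kernel ID.
        ids["worker"] = threading.get_native_id()

    worker = threading.Thread(target=record)
    worker.start()
    worker.join()
    return ids


ids = native_ids()
print("process PID:  ", os.getpid())
print("caller thread:", ids["caller"])
print("worker thread:", ids["worker"])
```

On Linux, when this runs in the main thread, the caller's ID equals os.getpid() while the worker's ID is different. These are the numbers top -H shows in its PID column.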

Upvotes: 1

Toast_Roast

Reputation: 21

One possible solution is to use the command top with the option to display all threads:

> top -H

The tasks will be sorted by CPU usage by default.

Alternate solutions can be found in the previous thread here.

Upvotes: 2
