urig
urig

Reputation: 16831

Python: How to debug high CPU consumption on Linux?

I have a Python application that I am developing on Windows. I need to investigate a complaint that my app consumes 100% CPU when running in Linux.

I was able to reproduce the issue by:

  1. Running a Docker container based on Debian (python:3.9.13-buster).
  2. Cloning my git repo into the container.
  3. Building my app (poetry ftw) and running it.
  4. Running top to see that a python process is indeed consuming 100% CPU (and oddly enough occasionally more).

How do I debug my app in the container to understand which part of my code is misbehaving? Is there a utility that can profile my python program as it's running to pinpoint the offending code?

PS - I've tried using gdb -p <thread_id> and its bt command but I can't make sense of the stack trace printed.

Upvotes: 1

Views: 1645

Answers (1)

urig
urig

Reputation: 16831

Answering my own question in the hope others might find it useful in the future.

I've been able to use py-spy's top command on my program as it was running inside the container and its output has showed me that the hupper library is the cause of the high CPU usage.

Here are the steps that I've taken:

  1. I had to spin up a new container with the SYS_PTRACE capability otherwise py-spy can't run inside it:
    docker run -it --cap-add SYS_PTRACE python:3.9.13-buster /bin/bash
    
  2. I cloned my app into the container, built it, and ran it.
  3. I ran top to see what the process id is for my program (identifiable by its high CPU usage).
  4. I ran py-spy top --pid <process id> to see a live table showing me the most active parts of my code.

Similar to this:

(python v3.9.13)
Total Samples 78500
GIL: 15.00%, Active: 103.00%, Threads: 4

  %Own   %Total  OwnTime  TotalTime  Function (filename)
 92.00%  92.00%   717.0s    717.0s   get_mtime (hupper/polling.py)
  3.00%   3.00%   41.27s    67.08s   walk (dash/_watch.py)
  7.00%  99.00%   35.86s    752.9s   check_reload (hupper/polling.py)
  1.00% 100.00%   20.21s    773.1s   run (hupper/polling.py)
  0.00%   0.00%   17.08s    18.78s   _walk (os.py)
  0.00%   0.00%    4.24s     4.24s   <listcomp> (dash/_watch.py)
  0.00%   0.00%    2.49s     2.89s   join (posixpath.py)
  0.00%   0.00%    1.60s     1.60s   islink (posixpath.py)
  0.00%   0.00%   0.400s    0.400s   _get_sep (posixpath.py)
  0.00%   3.00%   0.210s    67.29s   watch (dash/_watch.py)
  0.00%   0.00%   0.030s    0.030s   _run_worker (hupper/reloader.py)
  0.00%   0.00%   0.020s    0.020s   _recv_packet (hupper/ipc.py)
  0.00%   3.00%   0.000s    67.31s   run (threading.py)
  0.00%   0.00%   0.000s    0.030s   start_reloader (hupper/reloader.py)
  0.00%   0.00%   0.000s    0.030s   <module> (<string>)
  0.00%   3.00%   0.000s    67.29s   <lambda> (dash/dash.py)
  0.00%   0.00%   0.000s    0.030s   serve (myapp/cli/main.py)
  0.00% 103.00%   0.000s    840.4s   _bootstrap (threading.py)
  0.00%   0.00%   0.000s    0.030s   serve_dashboard (myapp/dashboard/__init__.py)
  0.00%   0.00%   0.000s    0.030s   wrapper (myapp/cli/main.py)
  0.00%   0.00%   0.000s    0.020s   _read_loop (hupper/ipc.py)
  0.00%   0.00%   0.000s    0.030s   main (myapp/cli/main.py)
  0.00%   0.00%   0.000s    0.030s   run (hupper/reloader.py)

Upvotes: 2

Related Questions