Reputation: 1058
I want to understand how the Python interpreter works. I understand how opcodes are generated and want to better understand the interpreter part. For that I read a lot on the internet and learned about the for (;;) loop in the ceval.c file of the Python interpreter (CPython).
Now I want to interpret the following Python code, a.py:
a = 4
b = 5
c = a + b
When I run python -m dis a.py, I get:
  1           0 LOAD_CONST               0 (4)
              2 STORE_NAME               0 (a)
  2           4 LOAD_CONST               1 (5)
              6 STORE_NAME               1 (b)
  3           8 LOAD_NAME                0 (a)
             10 LOAD_NAME                1 (b)
             12 BINARY_ADD
             14 STORE_NAME               2 (c)
             16 LOAD_CONST               2 (None)
             18 RETURN_VALUE
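For reference, the same information can be obtained from inside Python with the dis module. The sketch below compiles the three lines under the file name a.py without executing them and collects the opcode names. Exact opcodes vary between CPython versions (for example, BINARY_ADD became BINARY_OP in 3.11), so the listing may differ from the one above. The code object also records where it came from in co_filename, which is one way to tell a program's own bytecode apart from code run during startup.

```python
import dis

# The three lines of a.py from the question.
SOURCE = "a = 4\nb = 5\nc = a + b\n"

# Compile only; the file name is stored in the resulting code object.
code = compile(SOURCE, "a.py", "exec")

# Opcode names in the order `python -m dis a.py` would list them.
opnames = [instr.opname for instr in dis.get_instructions(code)]

print(code.co_filename)
print(opnames)
```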
Now I have put a breakpoint on the switch (opcode) line in ceval.c. When I start the debugger, it stops at this position more than 2000 times. I think this is because Python has to interpret other code during startup before it reaches my program. So my question is: how do I debug only the relevant opcode instructions?
Basically, how do I know that the instructions I am debugging actually come from the program I created?
Please help me out with this. Thanks in advance.
Upvotes: 5
Views: 960
Reputation: 1093
I do a lot of CPython debugging to better understand how it works. gdb cannot set a breakpoint on a line of a Python source file; I worked around that by writing a C extension module.
The idea: CPython is a big program written in C, so we can easily debug it like any other C program. If we want to stop execution when the _PyType_Lookup function is entered, we just run the break _PyType_Lookup command. Thus, if we add our own C function to CPython, for example cbreakpoint, we can stop execution every time cbreakpoint is called. And if we find a way to call this cbreakpoint function from source.py, we get the required functionality: every time the interpreter reaches a cbreakpoint call, it stops (provided we ran break cbreakpoint beforehand). We can do that by writing a C extension.
How I did it (I may be missing something, because I am reproducing this from memory):
1. Put the CPython source code in the ~/learning_python/cpython-master directory and compiled it. There were some intricacies; see Can't get rid of “value has been optimized out” in GDB.
2. Created the my_breakpoint.c file (listed at the end of this answer).
3. Created the my_breakpoint_setup.py file (also listed at the end of this answer).
4. Ran the following command:
~/learning_python/cpython-master/python my_breakpoint_setup.py build
It created a my_breakpoint.cpython-38dm-x86_64-linux-gnu.so file.
5. Copied the shared object file from the previous step into CPython's Lib directory:
cp -iv my_breakpoint.cpython-38dm-x86_64-linux-gnu.so ~/learning_python/cpython-master/Lib/
The copy is only for convenience; otherwise the .so file would have to be present in every directory from which we want to import this module.
6. Now we can create the following source.py:
#!/usr/bin/python3
from my_breakpoint import cbreakpoint
cbreakpoint(1)
a = 4
cbreakpoint(2)
b = 5
cbreakpoint(3)
c = a + b
To execute this file we must use our ~/learning_python/cpython-master interpreter, not the system's python3, because the system Python doesn't have the my_breakpoint module:
~/learning_python/cpython-master/python source.py
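If source.py should also run under an interpreter that lacks the extension (for example, to check the script itself), one option is to fall back to a pure-Python stand-in. This is only a sketch and not part of the setup described here: the stand-in function is an assumption of this example, and gdb cannot break on it, so actual debugging still requires the custom build.

```python
try:
    # The C extension built in the steps above.
    from my_breakpoint import cbreakpoint
except ImportError:
    def cbreakpoint(breakpoint_id):
        """Hypothetical pure-Python stand-in; it returns the id like the C version."""
        return int(breakpoint_id)

cbreakpoint(1)
a = 4
cbreakpoint(2)
b = 5
cbreakpoint(3)
c = a + b
print(c)
```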
7. To debug this file, run:
gdb --args ~/learning_python/cpython-master/python -B source.py
Then, inside gdb:
(gdb) start
(gdb) break cbreakpoint
Function "cbreakpoint" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 2 (cbreakpoint) pending.
(gdb) cont
There is one problem. When you enter cont, gdb stops at the beginning of the cbreakpoint function, and you need to issue many next commands to get out of this function and of the CPython code that called it before you reach the start of the desired Python code. Alternatively, you can set a new breakpoint after cbreakpoint has been hit, like:
(gdb) break ceval.c:1080 ### The LOAD_CONST case beginning
(gdb) cont
But after doing this many times I automated these actions, so you can just add these lines to your ~/.gdbinit (the ceval.c line number depends on your CPython checkout):
set breakpoint pending on
break cbreakpoint
command $bpnum
  tbreak ceval.c:1098
  command $bpnum
    n
  end
  cont
end
set breakpoint pending off
Now just start gdb as in step 7 and run:
(gdb) start
(gdb) cont
and you will jump to the beginning of the source.py code execution.
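To know which opcodes to expect in ceval.c once gdb has stopped after a cbreakpoint call, the bytecode of the line following each marker can be listed with dis. The sketch below is an illustration under assumptions, not part of the original workflow: the helper name opcodes_after_markers is hypothetical, and source.py is only compiled, never executed, so the extension does not have to be installed. Opcode names depend on the CPython version.

```python
import dis

# source.py from this answer; compiled only, never executed.
SOURCE = """\
from my_breakpoint import cbreakpoint
cbreakpoint(1)
a = 4
cbreakpoint(2)
b = 5
cbreakpoint(3)
c = a + b
"""


def opcodes_after_markers(source, marker="cbreakpoint("):
    """Map each marker line number to the opcode names of the next source line."""
    code = compile(source, "source.py", "exec")
    # Bytecode offsets at which a new source line starts; skip entries
    # that carry no real line number.
    line_starts = {off: line for off, line in dis.findlinestarts(code) if line}
    by_line = {}
    current = None
    for instr in dis.get_instructions(code):
        current = line_starts.get(instr.offset, current)
        if current is not None:
            by_line.setdefault(current, []).append(instr.opname)
    return {
        lineno: by_line.get(lineno + 1, [])
        for lineno, text in enumerate(source.splitlines(), start=1)
        if text.startswith(marker)
    }


expected = opcodes_after_markers(SOURCE)
for lineno, ops in sorted(expected.items()):
    print(lineno, ops)
```

Each key is the line number of a cbreakpoint call, and the value lists the opcodes the interpreter should dispatch next for the statement under study.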
my_breakpoint.c
#include <Python.h>

/* Does nothing useful: it exists only so that gdb has a C symbol to break on.
   Returns the id it was given. */
static PyObject *cbreakpoint(PyObject *self, PyObject *args) {
    int breakpoint_id;
    if (!PyArg_ParseTuple(args, "i", &breakpoint_id))
        return NULL;
    return Py_BuildValue("i", breakpoint_id);
}

/* Method table: exposes cbreakpoint() to Python code. */
static PyMethodDef my_methods[] = {
    {"cbreakpoint", cbreakpoint, METH_VARARGS, "breakpoint function"},
    {NULL, NULL, 0, NULL}
};

static struct PyModuleDef my_breakpoint = {
    PyModuleDef_HEAD_INIT,
    "my_breakpoint",
    "the module for setting C breakpoint in the Python source",
    -1,
    my_methods
};

/* Module entry point, called on import my_breakpoint. */
PyMODINIT_FUNC PyInit_my_breakpoint(void) {
    return PyModule_Create(&my_breakpoint);
}
my_breakpoint_setup.py
# Note: distutils was removed in Python 3.12; on newer interpreters,
# import setup and Extension from setuptools instead.
from distutils.core import setup, Extension

module = Extension('my_breakpoint', sources=['my_breakpoint.c'])

setup(name='PackageName',
      version='1.0',
      description='This is a package for my_breakpoint module',
      ext_modules=[module])
I asked a similar question in the past; it may be useful to you: The optimal way to set a breakpoint in the Python source code while debugging CPython by GDB.
Upvotes: 4