MiniMax

Reputation: 1093

The function object and the code object relation in CPython

Include/funcobject.h in the CPython source code starts with the following comment:

Function objects and code objects should not be confused with each other:

Function objects are created by the execution of the 'def' statement. They reference a code object in their __code__ attribute, which is a purely syntactic object, i.e. nothing more than a compiled version of some source code lines. There is one code object per source code "fragment", but each code object can be referenced by zero or many function objects depending only on how many times the 'def' statement in the source was executed so far.

that I don't quite understand.


I'll write down my partial understanding here. Maybe someone can complete it.

  1. Compilation stage.

    We have source file Test.py:

    def a_func():
        pass
    

    The interpreter parses and compiles it, creating two code objects: one for the Test.py module and one for a_func. The co_code field of the Test.py code object disassembles to:

      3           0 LOAD_CONST               0 (<code object a_func at 0x7f8975622b70, file "test.py", line 3>)
                  2 LOAD_CONST               1 ('a_func')
                  4 MAKE_FUNCTION            0
                  6 STORE_NAME               0 (a_func)
                  8 LOAD_CONST               2 (None)
                 10 RETURN_VALUE
    

    No function object is created at this stage.

  2. Execution stage.

    • Function objects are created by the execution of the 'def' statement.

    When the virtual machine reaches the MAKE_FUNCTION instruction, it creates the function object:

    typedef struct {
        PyObject_HEAD
        PyObject *func_code;        /* A code object, the __code__ attribute */
        PyObject *func_globals;     /* A dictionary (other mappings won't do) */
        PyObject *func_defaults;    /* NULL or a tuple */
        PyObject *func_kwdefaults;  /* NULL or a dict */
        PyObject *func_closure;     /* NULL or a tuple of cell objects */
        PyObject *func_doc;         /* The __doc__ attribute, can be anything */
        PyObject *func_name;        /* The __name__ attribute, a string object */
        PyObject *func_dict;        /* The __dict__ attribute, a dict or NULL */
        PyObject *func_weakreflist; /* List of weak references */
        PyObject *func_module;      /* The __module__ attribute, can be anything */
        PyObject *func_annotations; /* Annotations, a dict or NULL */
        PyObject *func_qualname;    /* The qualified name */
    } PyFunctionObject;
    
    • They reference a code object in their __code__ attribute, which is a purely syntactic object, i.e. nothing more than a compiled version of some source code lines.

    The virtual machine puts the a_func code object into the PyObject *func_code field of the new function object. So the comment's message, that the function object and the code object are not the same thing, is now clear to me.

    • There is one code object per source code "fragment", but each code object can be referenced by zero or many function objects depending only on how many times the 'def' statement in the source was executed so far.

    The part that I don't understand is the one emphasized in bold.
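The two stages described above can be checked from Python itself. This is only a minimal sketch: the variable names (source, module_code, namespace) are mine, and the source text is passed as a string instead of being read from Test.py.

```python
import types

# Same content as Test.py, held in a string for the sake of the sketch.
source = "def a_func():\n    pass\n"

# Stage 1: compilation only. Nothing is executed, so no function object exists yet.
module_code = compile(source, "Test.py", "exec")

# The code object for a_func is stored among the constants of the module code object.
nested = [c for c in module_code.co_consts if isinstance(c, types.CodeType)]
a_func_code = nested[0]
print(a_func_code.co_name)  # a_func

# Stage 2: execution. Running the module code executes the 'def' statement,
# which creates the function object and binds it to the name a_func.
namespace = {}
exec(module_code, namespace)
a_func = namespace["a_func"]

print(isinstance(a_func, types.FunctionType))  # True
print(a_func.__code__ is a_func_code)          # True
```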

Upvotes: 1

Views: 246

Answers (1)

Davis Herring

Reputation: 40013

If I make a lambda factory (a good idea for scope reasons):

def mk_const(k):
  def const(): return k
  return const

then there is one code object for mk_const and one for const, but there are as many function objects for the latter as there have been calls to mk_const (possibly 0).

(It makes no difference to use a lambda, but it’s easier to explain with def.)
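A quick way to see this, as a sketch (the names one, two and inner are just for illustration): call the factory twice and compare the results. The two function objects are distinct and carry different closures, but they share a single code object, which is the one stored in the constants of mk_const.

```python
import types

def mk_const(k):
    def const():
        return k
    return const

# Each call to mk_const executes the inner 'def' once,
# so each call creates a new function object.
one = mk_const(1)
two = mk_const(2)

print(one is two)                    # False
print(one.__code__ is two.__code__)  # True
print(one(), two())                  # 1 2

# The shared code object is a constant of mk_const's own code object.
inner = [c for c in mk_const.__code__.co_consts if isinstance(c, types.CodeType)]
print(inner[0] is one.__code__)      # True
```

What differs between one and two is the per-function state (here, the closure cell holding k), not the compiled code.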

A code object with no function objects can also be the result of an if:

if lib.version>=4:
  def f(x): return lib.pretty(x)
else:
  def f(x): return str(x)  # fallback

There are two code objects here (plus the one for the module), but at most one of them gets used.
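Here is a rough sketch of how to observe that. Since lib is only a placeholder, I replaced it with a plain variable named version and a made-up "pretty" branch; those substitutions are assumptions for illustration only.

```python
import types

# Stand-in for the snippet above: 'version' replaces lib.version,
# and the string prefix replaces lib.pretty (both hypothetical).
source = '''
if version >= 4:
    def f(x): return "pretty:" + str(x)
else:
    def f(x): return str(x)  # fallback
'''

module_code = compile(source, "<example>", "exec")

# Both 'def' bodies were compiled, so the module holds two code objects named f.
f_codes = [c for c in module_code.co_consts if isinstance(c, types.CodeType)]
print(len(f_codes))  # 2

# Only one 'def' statement is executed, so only one code object
# ends up referenced by a function object.
namespace = {"version": 3}
exec(module_code, namespace)
f = namespace["f"]
used = [c for c in f_codes if c is f.__code__]
print(len(used))  # 1
print(f(7))       # 7
```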

Upvotes: 1
