What are all those zeros in Python bytecode, and how do I compute them?

Question

When I run list(some_function.__code__.co_code), I can see the raw bytecode of that function (as a list[int]). It contains a lot of zeros, more than in earlier versions of Python. I have seen this question, but a function created in Python 3.12 has more zeros in its bytecode than the one discussed there. My question is: what do all those zeros mean, and if I want to write bytecode by hand, how can I compute how many zeros are needed?

See what happens:

def f(x):
    return x + x/3

bytecode = list(f.__code__.co_code)
print(bytecode)

prints:

[151, 0, 124, 0, 100, 1, 124, 0, 122, 11, 0, 0, 122, 0, 0, 0, 83, 0]

import dis

def f(x):
    return x + x/3

dis.dis(f, show_caches=True)

gives:

  1           0 RESUME                   0
  2           2 LOAD_FAST                0 (x)
              4 LOAD_FAST                0 (x)
              6 LOAD_CONST               1 (3)
              8 BINARY_OP               11 (/)
             10 CACHE                    0 (counter: 0)
             12 BINARY_OP                0 (+)
             14 CACHE                    0 (counter: 0)
             16 RETURN_VALUE

This differs from the bytecode in the question mentioned above in a couple of ways:

First, there is a RESUME at the front. According to the official dis reference, it is a no-op used for internal tracing, debugging and optimization checks.
Binary operations are also encoded differently: instead of one opcode per operation, there is a single BINARY_OP opcode whose argument selects the operation.
The main difference, however, is that in some places there are fewer zeros, and in others more...

What's going on here? And why does every opcode take an argument, even those whose opcode is less than dis.HAVE_ARGUMENT?
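As a rough illustration of the "one argument per opcode" layout: since Python 3.6, CPython bytecode is stored as fixed two-byte units, an opcode byte followed by an argument byte, so instructions that take no argument still carry a padding byte. The sketch below only pairs up the bytes printed above; the helper name code_units is made up for this example.

```python
# Byte list copied from the printed output above.
raw = [151, 0, 124, 0, 100, 1, 124, 0, 122, 11, 0, 0, 122, 0, 0, 0, 83, 0]

def code_units(raw_bytes):
    """Split raw bytecode into (opcode, oparg) pairs, two bytes per unit."""
    if len(raw_bytes) % 2:
        raise ValueError("bytecode length must be even")
    return [(raw_bytes[i], raw_bytes[i + 1])
            for i in range(0, len(raw_bytes), 2)]

units = code_units(raw)
# Print the byte offset of each unit next to its opcode and argument.
for offset, (op, arg) in zip(range(0, len(raw), 2), units):
    print(offset, op, arg)
```

The final pair (83, 0) is the RETURN_VALUE at offset 16 with an unused 0 argument, and the (0, 0) pairs at offsets 10 and 14 are the units that dis reports as CACHE.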

While this is not especially strange, it gets stranger with the following function:

def f():
    print("hello world!")

bytecode: [151, 0, 116, 1, 0, 0, 0, 0, 0, 0, 0, 0, 100, 1, 171, 1, 0, 0, 0, 0, 0, 0, 1, 0, 121, 0]

Could someone also explain all these zeros?

Thanks in advance!

EDIT

I see that all those zeros are CACHE opcodes, but how do I compute how many CACHE entries are needed?
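One way to get the counts without hard-coding them, sketched here under the assumption that every instruction is a two-byte unit (Python 3.6+): take the offsets that dis.get_instructions reports and measure the gap to the next instruction. Whatever lies between two instructions is cache space. The helper name cache_counts and the sample function hello are made up for this example, and the numbers it prints depend on the interpreter version.

```python
import dis

def cache_counts(func):
    """Return (opname, number of 2-byte cache units after it) per instruction."""
    raw = func.__code__.co_code
    # CACHE entries are not listed unless explicitly requested.
    instrs = list(dis.get_instructions(func))
    # Each instruction ends where the next one starts; the last ends at len(raw).
    ends = [ins.offset for ins in instrs[1:]] + [len(raw)]
    return [(ins.opname, (end - ins.offset) // 2 - 1)
            for ins, end in zip(instrs, ends)]

def hello():
    print("hello world!")

for name, n in cache_counts(hello):
    print(f"{name:<20} {n}")
```

On the interpreter used in this question, this should report 4 cache units after LOAD_GLOBAL and 3 after CALL, matching the runs of eight and six zeros in the hello-world bytecode above.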

EDIT

It has been suggested that the 0s in the bytecode are arguments rather than CACHE entries, but that appears to be incorrect.

Looking at an annotated output of:

import dis

def f():
    print("hello world!")

print(list(f.__code__.co_code))

for instr in dis.Bytecode(f):
    print(instr.opname, instr.opcode, instr.arg)

One can see:

[151, 0, 116, 1, 0, 0, 0, 0, 0, 0, 0, 0, 100, 1, 171, 1, 0, 0, 0, 0, 0, 0, 1, 0, 121, 0]
   |  |    |  |                            |  |    |  |                    |  |    |  |
   |  |    |  |                            |  |    |  |                    |  |    |  |
   |  |    |  |                            |  |    |  |                    |  |    |  |
   |  |    |  |                            |  |    |  |                    |  |    |  |
   |--|    |--|---------|                  |  |    |  |                    |  |    |  |
   |--|---------------| |                  |  |    |  |                    |  |    |  |
                      | |                  |  |    |  |                    |  |    |  |
RESUME 151 0        --| |                  |  |    |  |                    |  |    |  |
LOAD_GLOBAL 116 1   ----|                  |  |    |  |                    |  |    |  |
LOAD_CONST 100 1    -----------------------|--|    |  |                    |  |    |  |
CALL 171 1          -------------------------------|--|                    |  |    |  |
POP_TOP 1 None      -------------------------------------------------------|--|    |  |
RETURN_CONST 121 0  ---------------------------------------------------------------|--|

Many of the 0 values that the loop does not point to are shown as CACHE by dis.dis(f, show_caches=True).
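To make the annotated picture concrete, here is a small sketch that walks the byte list above and skips the cache units. The cache table is read off this one listing (4 units after opcode 116, 3 after opcode 171) and is an assumption tied to the interpreter that produced it, not a general rule; walk and CACHE_UNITS are made-up names.

```python
# Byte list copied from the annotated output above.
raw = [151, 0, 116, 1, 0, 0, 0, 0, 0, 0, 0, 0, 100, 1,
       171, 1, 0, 0, 0, 0, 0, 0, 1, 0, 121, 0]

# Cache units following each opcode, inferred from this listing only.
CACHE_UNITS = {116: 4, 171: 3}  # LOAD_GLOBAL and CALL in this listing

def walk(raw_bytes, cache_units):
    """Yield (offset, opcode, oparg) for real instructions, skipping caches."""
    i = 0
    while i < len(raw_bytes):
        op, arg = raw_bytes[i], raw_bytes[i + 1]
        yield i, op, arg
        # Advance past this instruction and any cache units that follow it.
        i += 2 * (1 + cache_units.get(op, 0))

decoded = list(walk(raw, CACHE_UNITS))
for offset, op, arg in decoded:
    print(offset, op, arg)
```

The six tuples it yields line up with the six instructions that the dis.Bytecode loop printed.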
