Kikapi

Reputation: 369

The process of calling a Python library function

I'm trying to understand how a Python function call works. I created a simple .pyc file that calls os.listdir('.') and saw that os and listdir are stored in the co_names table. When the CALL_FUNCTION bytecode instruction executes, how is the os library identified? Is it looked up by name through the co_names table? Does Python start searching for a module named os.pyc? If so, how does Python know the offset of the called function's bytecode within that .pyc module?

Thanks.

Bytecode snippet from the dis module:

  5          28 LOAD_NAME                0 (os)
             31 LOAD_ATTR                2 (listdir)
             34 LOAD_CONST               3 ('.')
             37 CALL_FUNCTION            1

Upvotes: 3

Views: 311

Answers (1)

kindall

Reputation: 184221

Python's virtual machine is stack-based. References to Python objects are pushed onto the stack, and an opcode pulls one or more of these off, does some operation, and usually pushes the result back on the stack for use by the next opcode.

As an aside, you might find disassembling a simple arithmetic calculation interesting (the operations must be completely reordered to work in this format). Or read up on FORTH; Python's VM is not dissimilar to FORTH's, but the actual FORTH language reflects its VM in a way that Python's does not. Anyway, on with the explanation...
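For anyone who wants to try that aside, here is a minimal sketch, assuming a recent CPython 3. Opcode names and offsets vary between versions, and the variable names a, b and c are just placeholders chosen for this example.

```python
import dis

# Variables rather than literals, so the compiler cannot fold the
# expression into a single constant at compile time.
expr_code = compile("a + b * c", "<expr>", "eval")

# Print the bytecode: all three operands are loaded in source order, and
# the multiplication opcode appears before the addition opcode.
dis.dis(expr_code)

# Names in the order the VM pushes them onto the stack.
load_order = [ins.argval for ins in dis.get_instructions(expr_code)
              if ins.opname == "LOAD_NAME"]
print(load_order)

# The compiled code object still evaluates like the original expression.
value = eval(expr_code, {"a": 2, "b": 3, "c": 4})
print(value)
```

The infix expression comes out in postfix order: operands first, operators afterwards, which is the reordering mentioned above.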

The LOAD_NAME opcode gets a reference to the os object. (It happens to be a module, but it doesn't matter what kind of object it is; it works the same way for all kinds of objects.) The reference is placed on top of the stack.

(This doesn't search for or load the module, so no os.pyc lookup happens at this point. An earlier import statement has already bound the name os to the module object, and LOAD_NAME merely looks that name up: first in the local namespace, then in the global variables, then in the builtins.)
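One way to convince yourself that the lookup is purely by name is to run the same compiled expression against a namespace you control. This is only a sketch: fake_os is a hypothetical stand-in object invented for the demonstration, not anything from the standard library.

```python
import types

# Compile the call once; nothing is imported at compile time.
call_code = compile("os.listdir('.')", "<demo>", "eval")

# Hypothetical stand-in for the real module: any object with a
# callable "listdir" attribute will do.
fake_os = types.SimpleNamespace(
    listdir=lambda path: ["fake entry for " + path]
)

# The name "os" resolves to whatever the namespace binds it to.
result = eval(call_code, {"os": fake_os})
print(result)

# With no binding for "os", the lookup fails with NameError; the
# bytecode does not go looking for a module file on its own.
try:
    eval(call_code, {})
    lookup_failed = False
except NameError:
    lookup_failed = True
print(lookup_failed)
```

The bytecode never mentions a file or an offset; it only asks the namespace for whatever object is currently bound to the name os.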

The LOAD_ATTR opcode gets a reference to the listdir attribute of whatever object is referenced at the top of the stack. (Here the attribute happens to be a function, but again that doesn't matter.) The object reference at the top of the stack is popped off and the result of the LOAD_ATTR is pushed on.

The LOAD_CONST opcode gets a reference to the string '.' and pushes it on top of the stack.

Now the CALL_FUNCTION pops 1 reference off the stack. This is the reference to the string '.', the argument to os.listdir. (It knows to pop 1 reference because the operand of CALL_FUNCTION is 1. If the function took more arguments, there would be more LOAD opcodes and the operand of the CALL_FUNCTION opcode would be higher.) It pops another reference off the stack, which is the reference to the os.listdir function. It then calls the function with the arguments. The return value of the function is then pushed on the stack, where it can be used by further opcodes.
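The four steps above can be mimicked with a toy interpreter. This is a simplified sketch, not CPython's actual evaluation loop: the run helper and the tuple-based instruction format are invented for illustration, and the table indices are chosen for this example rather than copied from the listing in the question.

```python
import os

def run(instructions, names, consts, namespace):
    """Execute a tiny subset of stack-machine opcodes; return the top of stack."""
    stack = []
    for opname, arg in instructions:
        if opname == "LOAD_NAME":
            # Look the name up and push the object it is bound to.
            stack.append(namespace[names[arg]])
        elif opname == "LOAD_ATTR":
            # Replace the object on top of the stack with one of its attributes.
            owner = stack.pop()
            stack.append(getattr(owner, names[arg]))
        elif opname == "LOAD_CONST":
            stack.append(consts[arg])
        elif opname == "CALL_FUNCTION":
            # Pop `arg` arguments (the last one pushed comes off first),
            # then the callable, and push the return value.
            args = [stack.pop() for _ in range(arg)]
            args.reverse()
            func = stack.pop()
            stack.append(func(*args))
        else:
            raise ValueError("unsupported opcode: " + opname)
    return stack.pop()

names = ("os", "listdir")   # plays the role of co_names
consts = (".",)             # plays the role of co_consts
program = [
    ("LOAD_NAME", 0),       # push os
    ("LOAD_ATTR", 1),       # replace it with os.listdir
    ("LOAD_CONST", 0),      # push '.'
    ("CALL_FUNCTION", 1),   # call with one positional argument
]
listing = run(program, names, consts, {"os": os})
print(listing)
```

Note that the toy loop never needs to know that os is a module or that listdir is a function; it only pushes, pops and calls object references.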

As you have discovered, the names os and listdir are stored in a table, co_names. The operands to the LOAD_NAME and LOAD_ATTR opcodes are indices into this table. The '.' is handled similarly, except it's stored in the co_consts table.
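You can inspect both tables directly on a code object. The sketch below assumes a recent CPython 3, where the exact tuple contents and opcode set may differ from the output in the question. It only decodes LOAD_NAME and LOAD_CONST by raw operand, because newer versions pack extra flag bits into the operand of some other opcodes such as LOAD_ATTR; for those, the argval field reported by dis is the safer thing to read.

```python
import dis

# Compile a small module-level snippet without executing it.
code = compile("import os\nos.listdir('.')", "<demo>", "exec")

print(code.co_names)    # includes 'os' and 'listdir'
print(code.co_consts)   # includes '.'

# Resolve operands by hand for the two opcodes whose operand is a
# plain table index.
for ins in dis.get_instructions(code):
    if ins.opname == "LOAD_NAME":
        print(ins.opname, ins.arg, "->", code.co_names[ins.arg])
    elif ins.opname == "LOAD_CONST":
        print(ins.opname, ins.arg, "->", repr(code.co_consts[ins.arg]))
```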

Upvotes: 5
