Reputation: 164673
I understand that conditional expressions (or ternary operators) are lazy in Python. They represent conditional execution rather than conditional selection. In other words, only one of a
or b
is evaluated in the following:
c = a if condition else b
What I'm interested to know is how this is implemented internally. Does Python convert to an if
statement as below and, if so, at what stage does this conversion occur?
if condition:
c = a
else:
c = b
Or is the ternary operator actually a distinct and separate expression, defined entirely separately? If so, can I access the CPython code for conditional expressions?
I've looked at the following which explain what the ternary operator does, but none of them make clear how they are implemented:
Edit: You can assume CPython reference implementation.
Upvotes: 11
Views: 5968
Reputation: 1121972
Python doesn't have to convert anything, and couldn't if it wanted to.
The conditional expression is parsed by using the language grammar into an abstract syntax tree, which in turn is then compiled to bytecode. You can produce the AST by using the ast.parse()
function:
>>> import ast
>>> ast.parse('c = a if condition else b').body[0] # first statement in the tree
<_ast.Assign object at 0x10f05c550>
>>> ast.dump(ast.parse('c = a if condition else b').body[0])
"Assign(targets=[Name(id='c', ctx=Store())], value=IfExp(test=Name(id='condition', ctx=Load()), body=Name(id='a', ctx=Load()), orelse=Name(id='b', ctx=Load())))"
Note the ast.IfExp()
node in the AST produced for the assignment; this is a dedicated node for conditional expressions. It has test
, body
and orelse
parts to represent the 3 expressions that make up the condition, true and false parts. This is documented in the ast
module Abstract Grammar section:
expr = [...] | [...] | IfExp(expr test, expr body, expr orelse)
This shows that the type of each element is another expr
expression node.
The parse tree is then compiled to bytecode that uses the stack to conditionally jump to the right section based on the test; we can pass the AST produced by ast.parse()
directly to the compile()
function, after which the dis
module lets us look at a human-friendly form of the bytecode produced by compilation:
>>> import dis
>>> dis.dis(compile(ast.parse('c = a if condition else b'), '', 'exec'))
1 0 LOAD_NAME 0 (condition)
2 POP_JUMP_IF_FALSE 8
4 LOAD_NAME 1 (a)
6 JUMP_FORWARD 2 (to 10)
>> 8 LOAD_NAME 2 (b)
>> 10 STORE_NAME 3 (c)
12 LOAD_CONST 0 (None)
14 RETURN_VALUE
So if the condition is false, the interpreter loop jumps forward to instruction 8, otherwise instructions 4 and 6 are executed, with instruction 6 jumping forward to instruction 10 (so past the else
expression). The end result is that either instruction 4 or instruction 8 puts a new result on the top of the stack for STORE_NAME
to move to a variable.
An if
statement results in a different AST node, and the resulting bytecode happens to be very similar in that it too would use jumps. But the compiler treats them as distinct pieces of syntax, and it has to.
Expressions and statements are two very different fundamental building blocks of a programming language. Statements can contain expressions, but expressions can't contain statements, only other expressions. And expressions can produce a value (for the surrounding syntax to use), but statements can't. So Python has to treat conditional expressions very differently from statements, in that the grammar parser knows when to expect a statement and when an expression is allowed. If you transformed a conditional expression into a statement, you would not be able to ever use such an expression as part of a bigger expression!
Because an if
statement is not an expression, it doesn't return a value (as only expressions can produce a value), and so the resulting bytecode would not produce a value on the top of the stack to be used by the surrounding Python code (there is no c = if condition : ...
). if
statements contain a condition expression, and a suite, which must always consist of more statements (there is such a thing as an 'expression statement' to let you put just an expression in a statement, such as 1 + 1
on a single line), and those statements can 'do stuff' like assignments or return from a function, but nothing they do would ever make if
return something.
This is reflected in the AST node definition for if
statements:
stmt = [...] | [...] | If(expr test, stmt* body, stmt* orelse)
So for an If
node, test
is the only expression node, and body
and orelse
both consist of zero or more statements. The orelse
part would hold any elif ...:
tests as further If()
nodes, or any other type of statement to form an unconditional else:
. With zero-or-more elements, you can't expect a single result.
So this isn't unique to CPython, this applies to all Python implementations. The Python grammar is not an implementation detail.
Upvotes: 12
Reputation: 7735
If you are asking what, then to best understand it, you need to understand the difference between functional and procedural. One can be converted to the other, but both can be view independently, you don't have to translate one to the other to understand them.
value_a if condition else value_b
is functional, and returns a value value_a
or value_b
.
if condition then:
do_a
else:
do_b
is procedural, it does do_a
or do_b
.
Note: Procedural is about doing, do this, then do that, or that. Functional is about value, is this or is that.
If you are asking how, then you will need to look at the source-code of one of the implementations. Note that each implementation does not have to do it the same way, so long as behaviour is correct.
Upvotes: 0
Reputation: 81604
Does Python convert to an if statement as below
Almost.
import dis
def trenary():
x = 'a' if 1 == 1 else 'b'
def normal_if():
if 1 == 1:
c = 'a'
else:
c = 'b'
print('trenary')
dis.dis(trenary)
print()
print('normal if')
dis.dis(normal_if)
This outputs:
trenary
68 0 LOAD_CONST 1 (1)
2 LOAD_CONST 1 (1)
4 COMPARE_OP 2 (==)
6 POP_JUMP_IF_FALSE 12
8 LOAD_CONST 2 ('a')
10 JUMP_FORWARD 2 (to 14)
>> 12 LOAD_CONST 3 ('b')
>> 14 STORE_FAST 0 (x)
16 LOAD_CONST 0 (None)
18 RETURN_VALUE
normal if
71 0 LOAD_CONST 1 (1)
2 LOAD_CONST 1 (1)
4 COMPARE_OP 2 (==)
6 POP_JUMP_IF_FALSE 14
72 8 LOAD_CONST 2 ('a')
10 STORE_FAST 0 (c)
12 JUMP_FORWARD 4 (to 18)
74 >> 14 LOAD_CONST 3 ('b')
16 STORE_FAST 0 (c)
>> 18 LOAD_CONST 0 (None)
20 RETURN_VALUE
These look almost the same, except for the location of JUMP_FORWARD
and an additional STORE_FAST
as pointed out by @L3viathan.
We also get almost the same execution times (with a negligible difference):
from timeit import Timer
print(min(Timer(trenary).repeat(5000, 5000)))
print(min(Timer(normal_if).repeat(5000, 5000)))
# 0.0006442809999998023
# 0.0006442799999994975
As to when this conversion happens, I'd assume sometime during "compilation" to bytecode.
Upvotes: 2