Reputation: 71
I understand how in recursion every recursive call stacks on the stack; if the stack limit is exceeded, you get a stack overflow.
Why then does Python's sys.getrecursionlimit()
return a number; a max depth of recursive calls?
Does it not depend on what I do in that recursive function? Or does it somehow save the variables somewhere else, other than the stack? How does it work?
Upvotes: 4
Views: 2934
Reputation: 365657
The easy, if oversimplified, way to think about this is that the Python stack is not actually a giant array with all the frames concatenated, but a linked list of frames.1 But even that can be misleading if you're thinking in, say, C terms. Which you seem to be:
Or does it somehow save the variables somewhere else, other than the stack?
It does—in CPython, local variables2 are stored in an array on the heap-allocated frame object—but that usually isn't the relevant question.
In C, a variable is a typed memory location. When you write int lst[100];
, that allocates 400 bytes on the stack and names it lst
.
In Python, a variable is just a name (in some namespace) for a value. Memory locations (and types) are a property of values, not variables, and they always live somewhere in the heap.3 Variables are just references to them. So, if you write lst = [0]*100
, that's just 8 bytes for the variable (pointer) in the locals array, and then 864 bytes for the list object on the heap.4
The RecursionError
limit is there because most Python code that goes to a depth of 1000 is probably going to just take a very long time allocate a whole bunch of Python frames before failing on either a MemoryError
or a stack overflow segfault, so it's better to stop you before allocating all that memory and burning all that CPU.
More importantly, as tdelaney points out in a comment, recovering from either of those conditions is very difficult in Python—but recovering from a RecursionError
is pretty simple; it unwraps the stack to the top of the recursion for you and leaves you in a predictable state.
But that rule of thumb doesn't apply to every program, just most of them—so if you know you have an algorithm that may go a few thousand frames deep without any problems, Python lets you increase the limit to, say, 10000 instead of 1000.
1. This is oversimplified because (at least in CPython) the interpreter is often actually chaining up calls on the C stack—but it's still useful to remember that there's a new frame object (and the other stuff the frame allocates) being heap allocated every time you recurse in Python, whether the interpreter is recursing or not. (Especially since Python is defined as never doing tail call elimination at the Python level, even if the interpreter actually does so in the eval loop.)
2. Technically, in Python, all variables are stored in a namespace, a mapping from names to references to values. But CPython optimizes local variables by storing an array of pointers, and then having the compiler convert local references into array lookups instead of mapping lookups.
3. And of course that "somewhere" is unspecified—Python is garbage-collected, whether using automatic refcounting plus a cycle detector as in CPython, or whatever the underlying JVM uses as in Jython. But in CPython, there's also a defined C API, where objects are C pointers to structs—and you can see the value of this pointer with the id
function.
4. Also, that 864 bytes is mostly just a list of 100 pointers to a single immutable 0
object, unlike C, where there are 100 separate mutable int
slots that all have the value 0
in them.
Upvotes: 14