Reputation: 4035
From the Python documentation (docs.python.org/tutorial/introduction.html#strings):
Slice indices have useful defaults; an omitted first index defaults to zero, an omitted second index defaults to the size of the string being sliced.
For the standard case, this makes a lot of sense:
>>> s = 'mystring'
>>> s[1:]
'ystring'
>>> s[:3]
'mys'
>>> s[:-2]
'mystri'
>>> s[-1:]
'g'
>>>
So far, so good. However, using a negative step value seems to suggest slightly different defaults:
>>> s[:3:-1]
'gnir'
>>> s[0:3:-1]
''
>>> s[2::-1]
'sym'
Fine, perhaps if the step is negative, the defaults reverse. An omitted first index defaults to the size of the string being sliced, an omitted second index defaults to zero:
>>> s[len(s):3:-1]
'gnir'
Looking good!
>>> s[2:0:-1]
'sy'
Whoops. Missed that 'm'.
Then there is everyone's favorite string reverse statement. And sweet it is:
>>> s[::-1]
'gnirtsym'
However:
>>> s[len(s):0:-1]
'gnirtsy'
The slice never includes the value of the second index in the slice. I can see the consistency of doing it that way.
So I think I am beginning to understand the behavior of slice in its various permutations. However, I get the feeling that the second index is somewhat special, and that the default value of the second index for a negative step cannot actually be defined in terms of a number.
Can anyone concisely define the default slice indices that can account for the provided examples? Documentation would be a huge plus.
Upvotes: 23
Views: 6011
Reputation: 21625
Great question. I thought I knew how slicing worked until I read this post. While your question title asks about "default slice indices" and that's been answered by abarnet, Martijn, and others, the body of your post suggests your real question is "How does slicing work?" So, I'll take a stab at that.
Given your example, s = "mystring", you can imagine a set of positive and negative indices.
 m   y   s   t   r   i   n   g
 0   1   2   3   4   5   6   7   <- positive indices
-8  -7  -6  -5  -4  -3  -2  -1   <- negative indices
We select slices of the form s[i:j:k]. The logic changes depending on whether k is positive or negative. I would describe the algorithm as follows.
if k is empty, set k = 1
if k is positive:
    move right, from i (inclusive) to j (exclusive) stepping by abs(k)
    if i is empty, start from the left edge
    if j is empty, go til the right edge
if k is negative:
    move left, from i (inclusive) to j (exclusive) stepping by abs(k)
    if i is empty, start from the right edge
    if j is empty, go til the left edge
(Note this isn't exactly pseudocode, as I intended it to be more comprehensible.)
>>> s[:3:]
'mys'
Here, k is empty so we set it equal to 1. Then since k is positive, we move right from i to j. Since i is empty, we start from the left edge and select everything up to but excluding the element at index 3.
>>> s[:3:-1]
'gnir'
Here, k is negative, so we move left from i to j. Since i is empty, we start from the right edge and select everything up to but excluding the element at index 3.
>>> s[0:3:-1]
''
Here, k is negative, so we move left from i to j. Since index 3 isn't to the left of index 0, no elements are selected and we get back the empty string.
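The walk described above can be sketched in Python 3. This is only an illustration for strings, not how CPython implements slicing. The names `slice_by_walking` and `resolve` are made up for this example, and the clamping of out-of-range indices is an added assumption that the description above does not spell out.

```python
def slice_by_walking(s, i=None, j=None, k=None):
    """Mimic s[i:j:k] for a string by walking positions left or right."""
    n = len(s)
    if k is None:
        k = 1
    if k == 0:
        raise ValueError("slice step cannot be zero")

    def resolve(idx):
        # Negative indices count back from the right edge.
        return idx + n if idx < 0 else idx

    if k > 0:
        # Moving right: empty i means the left edge, empty j the right edge.
        start = 0 if i is None else min(max(resolve(i), 0), n)
        stop = n if j is None else min(max(resolve(j), 0), n)
    else:
        # Moving left: empty i means the right edge, empty j means
        # "one step past the left edge", written here as -1 for range().
        start = n - 1 if i is None else min(max(resolve(i), -1), n - 1)
        stop = -1 if j is None else min(max(resolve(j), -1), n - 1)

    return "".join(s[p] for p in range(start, stop, k))


s = "mystring"
print(repr(slice_by_walking(s, None, 3)))      # 'mys'
print(repr(slice_by_walking(s, None, 3, -1)))  # 'gnir'
print(repr(slice_by_walking(s, 0, 3, -1)))     # ''
```

Note that the internal stop of -1 for a leftward walk is a position for `range()`, not something you can type into `s[...]`, where -1 would mean the last character.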
Upvotes: 1
Reputation: 8653
There are excellent answers here, and the best one has been accepted, but if you are looking for a way to wrap your head around the default values for a slice, it helps to imagine the list as having two ends: a HEAD end before the first element, and a TAIL end after the last element.
Now answering the actual question:
There are two defaults for the slices
Defaults when the step is +ve:
0:TAIL:+ve step
Defaults when the step is -ve:
-1:HEAD:-ve step
Upvotes: 0
Reputation: 1122392
The end value is always exclusive, so an end value of 0 means the slice includes index 1 but not index 0. Use None instead (since negative numbers have a different meaning):
>>> s[len(s)-1:None:-1]
'gnirtsym'
Note the start value as well; the last character index is at len(s) - 1; you may as well spell that as -1 (as negative numbers are interpreted relative to the length):
>>> s[-1:None:-1]
'gnirtsym'
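This matters most when the stop index is computed rather than typed. The Python 3 sketch below uses a hypothetical helper, `reversed_span`, that returns `s[lo:hi]` reversed with a single negative-step slice. It assumes `0 <= lo <= hi <= len(s)`.

```python
def reversed_span(s, lo, hi):
    """Return s[lo:hi] reversed, assuming 0 <= lo <= hi <= len(s)."""
    if lo == hi:
        # Empty span; also avoids hi - 1 == -1 being read as "last item".
        return s[:0]
    # lo - 1 would be -1 when lo == 0, which Python reads as "last item",
    # so pass None to mean "run off the left edge" instead.
    stop = lo - 1 if lo > 0 else None
    return s[hi - 1:stop:-1]


s = "mystring"
print(reversed_span(s, 0, 3))  # sym
print(reversed_span(s, 2, 5))  # rts
```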
Upvotes: 7
Reputation: 1947
Useful to know if you are implementing __getslice__ (Python 2 only): j defaults to sys.maxsize (https://docs.python.org/2/reference/datamodel.html#object.getslice)
>>> class x(str):
...     def __getslice__(self, i, j):
...         print i
...         print j
...
...     def __getitem__(self, key):
...         print repr(key)
...
>>> x()[:]
0
9223372036854775807
>>> x()[::]
slice(None, None, None)
>>> x()[::1]
slice(None, None, 1)
>>> x()[:1:]
slice(None, 1, None)
>>> import sys
>>> sys.maxsize
9223372036854775807L
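In Python 3, `__getslice__` no longer exists and every slice reaches `__getitem__` as a `slice` object. A minimal sketch (the class name `Probe` is made up for illustration) shows that omitted parts arrive as `None`:

```python
class Probe:
    """Return whatever key the subscript syntax passes to __getitem__."""

    def __getitem__(self, key):
        return key


p = Probe()
print(p[:])      # slice(None, None, None)
print(p[::-1])   # slice(None, None, -1)
print(p[:3])     # slice(None, 3, None)
print(p[2::-1])  # slice(2, None, -1)
```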
Upvotes: 1
Reputation: 365807
There actually aren't any defaults; omitted values are treated specially.
However, in every case, omitted values happen to be treated in exactly the same way as None. This means that, unless you're hacking the interpreter (or using the parser, ast, etc. modules), you can just pretend that the defaults are None (as recursive's answer says), and you'll always get the right answers.
The informal documentation cited isn't quite accurate—which is reasonable for something that's meant to be part of a tutorial. For the real answers, you have to turn to the reference documentation.
For 2.7.3, Sequence Types describes slicing in notes 3, 4, and 5.
For [i:j]:
… If i is omitted or None, use 0. If j is omitted or None, use len(s).
And for [i:j:k]:
If i or j are omitted or None, they become “end” values (which end depends on the sign of k). Note, k cannot be zero. If k is None, it is treated like 1.
For 3.3, Sequence Types has the exact same wording as 2.7.3.
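To see which concrete "end" values the interpreter picks, one option is the built-in `slice.indices(length)` method, which returns the `(start, stop, step)` that a slice resolves to for a sequence of the given length. A short Python 3 sketch, using the question's string:

```python
s = "mystring"  # length 8

forward = slice(None, None, 1).indices(len(s))
backward = slice(None, None, -1).indices(len(s))

print(forward)   # (0, 8, 1)
print(backward)  # (7, -1, -1)

# The resolved values are meant for range(), not for writing back into s[...]:
print("".join(s[x] for x in range(*backward)))  # gnirtsym
print(repr(s[7:-1:-1]))  # '' because a literal -1 means index 7 here
```

The resolved stop of -1 for a negative step means "one before index 0". That position has no spelling as a literal index, which is why it has to be written as an omitted value or None.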
Upvotes: 22
Reputation: 993403
The notes in the reference documentation for sequence types explain this in some detail:
(5.) The slice of s from i to j with step k is defined as the sequence of items with index x = i + n*k such that 0 <= n < (j-i)/k. In other words, the indices are i, i+k, i+2*k, i+3*k and so on, stopping when j is reached (but never including j). If i or j is greater than len(s), use len(s). If i or j are omitted or None, they become “end” values (which end depends on the sign of k). Note, k cannot be zero. If k is None, it is treated like 1.
So you can get the following behaviour:
>>> s = "mystring"
>>> s[2:None:-1]
'sym'
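The formula in note 5 can be checked directly. The Python 3 sketch below (the function name `note5_indices` is made up here) takes already-resolved `i`, `j`, `k` and lists `x = i + n*k` for `0 <= n < (j-i)/k`. Resolving the omitted stop is delegated to `slice.indices`, which is a choice made for this example rather than something the quoted note prescribes.

```python
import math


def note5_indices(i, j, k):
    """List x = i + n*k for every integer n with 0 <= n < (j - i) / k."""
    count = max(0, math.ceil((j - i) / k))
    return [i + n * k for n in range(count)]


s = "mystring"
i, j, k = slice(2, None, -1).indices(len(s))  # resolves to (2, -1, -1)
picked = note5_indices(i, j, k)
print(picked)                         # [2, 1, 0]
print("".join(s[x] for x in picked))  # sym
```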
Upvotes: 4
Reputation: 546
Actually, it is logical.
If you look at the end value, it always points one position past the last index that is included.
So with a negative step, using 0 as the end value means the slice stops at the element at index 1. You need to omit that value to get the string you want.
>>> s = '0123456789'
>>> s[0], s[:0]
('0', '')
>>> s[1], s[:1]
('1', '0')
>>> s[2], s[:2]
('2', '01')
>>> s[3], s[:3]
('3', '012')
>>> s[0], s[:0:-1]
('0', '987654321')
Upvotes: 1
Reputation: 86084
I don't have any documentation, but I think the default is [None:None:None]
>>> "asdf"[None:None:None]
'asdf'
>>> "asdf"[None:None:-1]
'fdsa'
Upvotes: 4