MikeG

Reputation: 4035

What are the default slice indices *really*?

From the Python tutorial (docs.python.org/tutorial/introduction.html#strings):

Slice indices have useful defaults; an omitted first index defaults to zero, an omitted second index defaults to the size of the string being sliced.

For the standard case, this makes a lot of sense:

>>> s = 'mystring'
>>> s[1:]
'ystring'
>>> s[:3]
'mys'
>>> s[:-2]
'mystri'
>>> s[-1:]
'g'

So far, so good. However, using a negative step value seems to suggest slightly different defaults:

>>> s[:3:-1]
'gnir'
>>> s[0:3:-1]
''
>>> s[2::-1]
'sym'

Fine, perhaps if the step is negative, the defaults reverse: an omitted first index defaults to the size of the string being sliced, and an omitted second index defaults to zero:

>>> s[len(s):3:-1]
'gnir'

Looking good!

>>> s[2:0:-1]
'sy'

Whoops. Missed that 'm'.

Then there is everyone's favorite string reverse statement. And sweet it is:

>>> s[::-1]
'gnirtsym'

However:

>>> s[len(s):0:-1]
'gnirtsy'

The slice never includes the value of the second index in the slice. I can see the consistency of doing it that way.

So I think I am beginning to understand the behavior of slice in its various permutations. However, I get the feeling that the second index is somewhat special, and that its default value for a negative step cannot actually be expressed as a number.

Can anyone concisely define the default slice indices that can account for the provided examples? Documentation would be a huge plus.

Upvotes: 23

Views: 6011

Answers (8)

Ben

Reputation: 21625

Great question. I thought I knew how slicing worked until I read this post. While your question title asks about "default slice indices", and that's been answered by abarnert, Martijn, and others, the body of your post suggests your real question is "How does slicing work?" So I'll take a stab at that.

Explanation

Given your example, s = 'mystring', picture each character labelled with both a positive and a negative index.

 m  y  s  t  r  i  n  g
 0  1  2  3  4  5  6  7 <- positive indices
-8 -7 -6 -5 -4 -3 -2 -1 <- negative indices

We select slices of the form s[i:j:k]. The logic changes depending on whether k is positive or negative. I would describe the algorithm as follows.

if k is empty, set k = 1

if k is positive:
  move right, from i (inclusive) to j (exclusive) stepping by abs(k)
  if i is empty, start from the left edge
  if j is empty, go until the right edge

if k is negative:
  move left, from i (inclusive) to j (exclusive) stepping by abs(k)
  if i is empty, start from the right edge
  if j is empty, go until the left edge

(Note this isn't exactly pseudocode, as I intended it to be more comprehensible.)
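
For anyone who would rather read this as code, here is a minimal Python 3 sketch of the same walk. The function name slice_by_walking is made up for illustration, and the clamping of out-of-range indices is my assumption about how the edges behave, chosen so that the results match the built-in slicing on the examples in the question. It is not how the interpreter is actually written.

```python
def slice_by_walking(s, i=None, j=None, k=None):
    """Hypothetical helper: pick s[i:j:k] by walking the string one step at a time."""
    n = len(s)
    if k is None:
        k = 1
    if k == 0:
        raise ValueError("slice step cannot be zero")

    def resolve(index):
        # Negative indices count back from the right edge.
        return index + n if index < 0 else index

    if k > 0:
        # Moving right: i defaults to the left edge, j to the right edge.
        start = 0 if i is None else min(max(resolve(i), 0), n)
        stop = n if j is None else min(max(resolve(j), 0), n)
    else:
        # Moving left: i defaults to the right edge, j to the left edge.
        # The left edge sits one position before index 0, written here as -1.
        start = n - 1 if i is None else min(max(resolve(i), -1), n - 1)
        stop = -1 if j is None else min(max(resolve(j), -1), n - 1)

    picked = []
    pos = start
    # Keep going until we reach j, which is never included.
    while (pos < stop) if k > 0 else (pos > stop):
        picked.append(s[pos])
        pos += k
    return "".join(picked)


s = 'mystring'
print(slice_by_walking(s, None, 3))      # mys
print(slice_by_walking(s, None, 3, -1))  # gnir
print(slice_by_walking(s, 2, 0, -1))     # sy
print(slice_by_walking(s, 2, None, -1))  # sym
```

The point to notice is the negative-step branch: the "left edge" is a position before index 0, which is why an explicit 0 for j drops the 'm' while an empty j keeps it.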


Examples

>>> s[:3:]
'mys'

Here, k is empty so we set it equal to 1. Then since k is positive, we move right from i to j. Since i is empty, we start from the left edge and select everything up to but excluding the element at index 3.

>>> s[:3:-1]
'gnir'

Here, k is negative, so we move left from i to j. Since i is empty, we start from the right edge and select everything up to but excluding the element at index 3.

>>> s[0:3:-1]
''

Here, k is negative, so we move left from i to j. Since index 3 isn't to the left of index 0, no elements are selected and we get back the empty string.

Upvotes: 1

Ajeet Ganga

Reputation: 8653

There are excellent answers here, and the best one has been accepted, but if you are looking for a way to wrap your head around the default values for a slice, it helps to imagine the list as having two ends: a HEAD end just before the first element, and a TAIL end just after the last element.

Now answering the actual question:

There are two sets of defaults for a slice:

  1. Defaults when step is +ve

    0:TAIL:+ve step

  2. Defaults when step is -ve

    -1:HEAD:-ve step

Upvotes: 0

Martijn Pieters

Reputation: 1122392

The end value is always exclusive, so an end value of 0 means the slice includes index 1 but not index 0. Use None instead (since negative numbers have a different meaning):

>>> s[len(s)-1:None:-1]
'gnirtsym'

Note the start value as well: the last character is at index len(s) - 1, which you may as well spell as -1 (negative numbers are interpreted relative to the length):

>>> s[-1:None:-1]
'gnirtsym'
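
To make the "negative numbers have a different meaning" point concrete, here is a short Python 3 sketch comparing a few explicit end values for a reversed slice. The variable names are just for illustration.

```python
s = 'mystring'
last = len(s) - 1

# An end value of 0 is exclusive, so index 0 (the 'm') is dropped.
stop_zero = s[last:0:-1]

# An end value of -1 does not mean "one before index 0": it is read
# relative to the length, so it names the last character and the slice
# is empty.
stop_minus_one = s[last:-1:-1]

# None (the same as omitting the end value) runs past index 0.
stop_none = s[last:None:-1]

# An end value that is still negative after len(s) is added is clamped to
# "before index 0", so this also reaches the 'm'.
stop_below_range = s[last:-len(s) - 1:-1]

print(repr(stop_zero))         # 'gnirtsy'
print(repr(stop_minus_one))    # ''
print(repr(stop_none))         # 'gnirtsym'
print(repr(stop_below_range))  # 'gnirtsym'
```

The last form works, but None states the intent far more clearly.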

Upvotes: 7

Kevin Smyth

Reputation: 1947

Useful to know if you are implementing __getslice__ (Python 2 only; it was removed in Python 3): j defaults to sys.maxsize (https://docs.python.org/2/reference/datamodel.html#object.getslice)

>>> class x(str):
...   def __getslice__(self, i, j):
...     print i
...     print j
...
...   def __getitem__(self, key):
...     print repr(key)
...
>>> x()[:]
0
9223372036854775807
>>> x()[::]
slice(None, None, None)
>>> x()[::1]
slice(None, None, 1)
>>> x()[:1:]
slice(None, 1, None)
>>> import sys
>>> sys.maxsize
9223372036854775807L
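
In Python 3 there is no __getslice__, so every slice expression reaches __getitem__ as a slice object. A rough equivalent of the experiment above, with a made-up class name SliceProbe, might look like this:

```python
class SliceProbe:
    """Hypothetical helper: hand back whatever key the subscript receives."""

    def __getitem__(self, key):
        return key


probe = SliceProbe()
print(probe[:])     # slice(None, None, None)
print(probe[::-1])  # slice(None, None, -1)
print(probe[:1:])   # slice(None, 1, None)
print(probe[2:])    # slice(2, None, None)
```

Every omitted position arrives as None; no number is filled in at this stage.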

Upvotes: 1

abarnert

Reputation: 365807

There actually aren't any defaults; omitted values are treated specially.

However, in every case, omitted values happen to be treated in exactly the same way as None. This means that, unless you're hacking the interpreter (or using the parser, ast, etc. modules), you can just pretend that the defaults are None (as recursive's answer says), and you'll always get the right answers.

The informal documentation cited isn't quite accurate—which is reasonable for something that's meant to be part of a tutorial. For the real answers, you have to turn to the reference documentation.

For 2.7.3, Sequence Types describes slicing in notes 3, 4, and 5.

For [i:j]:

… If i is omitted or None, use 0. If j is omitted or None, use len(s).

And for [i:j:k]:

If i or j are omitted or None, they become “end” values (which end depends on the sign of k). Note, k cannot be zero. If k is None, it is treated like 1.

For 3.3, Sequence Types has the exact same wording as 2.7.3.
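
If you want to see which concrete "end" values a given slice resolves to, slice objects have an indices() method that takes a length and returns a (start, stop, step) triple. A small Python 3 sketch (the variable names are mine):

```python
s = 'mystring'

# Resolve the omitted (None) positions against a sequence of length len(s).
forward = slice(None, None, None).indices(len(s))
backward = slice(None, None, -1).indices(len(s))
print(forward)   # (0, 8, 1)
print(backward)  # (7, -1, -1)

# The resolved triple is meant to be fed to range(), which does not
# reinterpret negative numbers.
reversed_chars = ''.join(s[x] for x in range(*backward))
print(reversed_chars)  # gnirtsym
```

Note that the resolved stop of -1 for the negative step only makes sense as a range() argument. Written literally inside the slice brackets, -1 would be taken to mean the last character, which is why no number you can type there reproduces the omitted value exactly.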

Upvotes: 22

Greg Hewgill

Reputation: 993403

The notes in the reference documentation for sequence types explain this in some detail:

(5.) The slice of s from i to j with step k is defined as the sequence of items with index x = i + n*k such that 0 <= n < (j-i)/k. In other words, the indices are i, i+k, i+2*k, i+3*k and so on, stopping when j is reached (but never including j). If i or j is greater than len(s), use len(s). If i or j are omitted or None, they become “end” values (which end depends on the sign of k). Note, k cannot be zero. If k is None, it is treated like 1.

So you can get the following behaviour:

>>> s = "mystring"
>>> s[2:None:-1]
'sym'
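
The index formula in that note can be sketched directly in Python 3. The helper name indices_from_formula is hypothetical, and it expects i and j that have already been resolved to concrete numbers. In particular I am assuming that the "end" value for an omitted j with a negative k is -1, one position before index 0, which is consistent with the results above.

```python
def indices_from_formula(i, j, k):
    """Hypothetical helper: list x = i + n*k for every integer n with 0 <= n < (j - i) / k."""
    # Number of valid n values: ceil((j - i) / k), but never below zero.
    # -((i - j) // k) is an exact integer ceiling division.
    count = max(0, -((i - j) // k))
    return [i + n * k for n in range(count)]


s = 'mystring'

# s[2:None:-1]: i = 2, j resolved to -1 (assumption, see above), k = -1
print(indices_from_formula(2, -1, -1))  # [2, 1, 0]

# s[2:0:-1]: j = 0 is never reached, so index 0 is left out
print(indices_from_formula(2, 0, -1))   # [2, 1]

# s[0:3:-1]: (j - i) / k is negative, so no n qualifies
print(indices_from_formula(0, 3, -1))   # []

print(''.join(s[x] for x in indices_from_formula(2, -1, -1)))  # sym
```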

Upvotes: 4

Mahmoud Aladdin

Reputation: 546

Actually, it is logical.

The end value always points one position past the last index that is included. So with a negative step, using 0 as the end value means the slice stops at the element at index 1. To get the whole reversed string, you need to omit the end value.

>>> s = '0123456789'
>>> s[0], s[:0]
('0', '')
>>> s[1], s[:1]
('1', '0')
>>> s[2], s[:2]
('2', '01')
>>> s[3], s[:3]
('3', '012')
>>> s[0], s[:0:-1]
('0', '987654321')

Upvotes: 1

recursive

Reputation: 86084

I don't have any documentation, but I think the default is [None:None:None]

>>> "asdf"[None:None:None]
'asdf'
>>> "asdf"[None:None:-1]
'fdsa'

Upvotes: 4
