Explogit

Reputation: 167

str.format(list) with negative index doesn't work in Python

I use a negative index in a replacement field to format a list element, but it raises a TypeError. The code is as follows:

>>> a=[1,2,3]
>>> a[2]
3
>>> a[-1]
3
>>> 'The last:{0[2]}'.format(a)
'The last:3'
>>> 'The last:{0[-1]}'.format(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: list indices must be integers, not str

Upvotes: 12

Views: 2585

Answers (5)

astromancer

Reputation: 611

It's relatively straightforward to extend the built-in string.Formatter to allow this case. The code below adds 2 lines to the existing implementation to resolve negative indices as integers.

from _string import formatter_field_name_split
from string import Formatter as BuiltinFormatter


class Formatter(BuiltinFormatter):
    """
    Implements negative indexing for format fields.
    """

    def get_field(self, field_name, args, kws):
        # eg: field_name = '0[name]' or 'label.title' or 'some_keyword'
        # OR 'key[-1]' implemented here
        first, rest = formatter_field_name_split(field_name)
        obj = self.get_value(first, args, kws)

        # loop through the rest of the field_name, doing
        #  getattr or getitem as needed
        for is_attr, i in rest:
            if is_attr:
                obj = getattr(obj, i)
            else:
                # fix to allow negative indexing next two lines :)
                # (digit-only keys already arrive as int, so check for str first)
                if isinstance(i, str) and i.startswith('-') and i[1:].isdigit():
                    i = int(i)
                obj = obj[i]

        return obj, first


# test
if __name__ == '__main__':
    formatter = Formatter()
    print(formatter.format('hello {worlds[-1]}!', 
                            worlds=('earth', 'mars', ..., 'world')))

Which outputs hello world! as desired.

Upvotes: 0

splidje

Reputation: 199

I often take Python format strings as config options - with the format string provided with a specific, known list of keyword arguments. Therefore addressing the indexes of a variable length list forwards or backwards within the format string is exactly the kind of thing I end up needing.

I've just written this hack to make the negative indexing work:

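import re

string_to_tokenise = "Hello_world"
# Split on runs of characters that are neither letters nor digits.
tokens = re.split(r"[^A-Z\d]+", string_to_tokenise, flags=re.I)
# Non-negative keys stay ints; negative keys become strings, as format() will see them.
token_dict = {str(i) if i < 0 else i: tokens[i] for i in range(-len(tokens) + 1, len(tokens))}
print("{thing[0]} {thing[-1]}".format(thing=token_dict))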

Result:

Hello world

To explain: instead of passing in the list of tokens, I build a dictionary with the integer keys 0 to len(..)-1 for indexing from the start, plus the negative keys -1 to -(len(..)-1) for indexing from the end. The negative keys are converted from integers to strings, because that is how format will interpret them.
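The same idea can be wrapped in a small helper. This is only a sketch: the name negative_index_view is made up for illustration, and unlike the snippet above it also includes the key for -len(..), so every valid negative index is covered.

```python
def negative_index_view(items):
    """Return a dict usable in format fields with negative indices.

    Hypothetical helper: non-negative indices are stored as int keys,
    negative indices as str keys, because format() passes '-1' as a string.
    """
    n = len(items)
    view = {i: items[i] for i in range(n)}
    view.update({str(-k): items[-k] for k in range(1, n + 1)})
    return view


tokens = ["Hello", "world"]
greeting = "{thing[0]} {thing[-1]}".format(thing=negative_index_view(tokens))
print(greeting)
```

The cost is building a dictionary twice the size of the list for every call, which is fine for short token lists.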

Upvotes: 0

Alex Martelli

Reputation: 882421

It's what I would call a design glitch in the format string specs. Per the docs,

element_index     ::=  integer | index_string

but, alas, -1 is not "an integer" -- it's an expression. The unary-minus operator doesn't even have particularly high priority, so that for example print(-2**2) emits -4 -- another common issue and arguably a design glitch (the ** operator has higher priority, so the raise-to-power happens first, then the change-sign requested by the lower priority unary -).

Anything in that position in the format string that's not an integer (but, for example, an expression) is treated as a string, to index a dict argument -- for example:

$ python3 -c "print('The last:{0[2+2]}'.format({'2+2': 23}))"
The last:23

Not sure whether this is worth raising an issue in the Python trac, but it's certainly a somewhat surprising behavior:-(.
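To see what actually reaches __getitem__, a throwaway class can echo the key it receives. This is just an illustrative sketch; ShowKey is a made-up name, not part of any library.

```python
class ShowKey:
    """Hypothetical probe: reports the key that format() passes to __getitem__."""

    def __getitem__(self, key):
        return "{!r} ({})".format(key, type(key).__name__)


probe = ShowKey()
print("{0[2]}".format(probe))    # key arrives as an int
print("{0[-1]}".format(probe))   # key arrives as the string '-1'
print("{0[2+2]}".format(probe))  # key arrives as the string '2+2'
```

A list rejects the string '-1' as an index, which is where the TypeError in the question comes from.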

Upvotes: 16

John Machin

Reputation: 83002

There are a few problems here, once you start digging:

The item in question is called "element_index" which is defined to be an integer.

Problem 1: unless users follow the link from "integer" to the language reference manual, they won't know that -1 is deemed to be an expression, not an integer. By the way, anyone tempted to say "works as documented" should see Problem 7 first :-)

Preferred solution: change the definition so that "element_index" can have an optional '-' before the integer.

It's an integer, right? Not so fast ... later the docs say that "an expression of the form '[index]' does an index lookup using __getitem__()"

Problem 3: Should say '[element_index]' (index is not defined).

Problem 4: Not everybody knows off the top of their heads what __getitem__() does. Needs clearer docs.

So we can use a dict here as well as an integer, can we? Yes, with a problem or two:

The element_index is an integer? Yes, that works with a dict:

>>> "{0[2]}".format({2: 'int2'})
'int2'

It seems that we can also use non-integer strings, but this needs more explicit documentation (Problem 5):

>>> "{0[foo]}".format({'foo': 'bar'})
'bar'

But we can't use a dict with a key like '2' (Problem 6):

>>> "{0[2]}".format({'2': 'str2'})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 2
>>> "{0['2']}".format({'2': 'str2'})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: "'2'"

Problem 7: That "integer" should really be documented to be "decimalinteger" ... 0x22 and 0b11 are treated as str, and 010 (an "octalinteger") is treated as 10, not 8:

>>> "{0[010]}".format('0123456789abcdef')
'a'

Update: PEP 3101 tells the true story:
"""
The rules for parsing an item key are very simple. If it starts with a digit, then it is treated as a number, otherwise it is used as a string.

Because keys are not quote-delimited, it is not possible to specify arbitrary dictionary keys (e.g., the strings "10" or ":-]") from within a format string.
"""
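Based on the examples above, the observed rule can be approximated as "all decimal digits means int, anything else stays a string". The function below is only a sketch of that observation, not the interpreter's actual parser, and guess_element_index is a made-up name.

```python
def guess_element_index(text):
    """Approximate how a format field key seems to be interpreted.

    Assumption drawn from the examples above: a key made only of decimal
    digits is converted with base 10; everything else is kept as a str.
    """
    return int(text) if text.isdecimal() else text


for key in ["2", "010", "-1", "0x22", "0b11", "foo", "'2'"]:
    print(repr(key), "->", repr(guess_element_index(key)))
```

Under this rule "010" becomes 10 (not 8), while "-1", "0x22" and "'2'" stay strings, which matches the behaviour shown in Problems 6 and 7.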

Upvotes: 4

Lennart Regebro

Reputation: 172319

Correct, it does not work. Solution:

>>> 'The last:{0}'.format(a[-1])
'The last:3'
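Two variations on the same workaround, as a sketch: pass the element as a keyword argument, or, assuming Python 3.6 or later, use an f-string, which evaluates a[-1] as an ordinary expression.

```python
a = [1, 2, 3]

# Index in regular Python code and hand the result to format().
last_keyword = 'The last:{last}'.format(last=a[-1])

# f-strings (Python 3.6+) evaluate the expression inside the braces.
last_fstring = f'The last:{a[-1]}'

print(last_keyword)
print(last_fstring)
```

Neither helps when the format string itself comes from configuration, since the indexing then has to happen outside the string.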

Upvotes: 1
