Reputation: 167
I used a negative index in a replacement field to output an item from a list, but it raises a TypeError. The code is as follows:
>>> a=[1,2,3]
>>> a[2]
3
>>> a[-1]
3
>>> 'The last:{0[2]}'.format(a)
'The last:3'
>>> 'The last:{0[-1]}'.format(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: list indices must be integers, not str
Upvotes: 12
Views: 2585
Reputation: 611
It's relatively straightforward to extend the builtin string.Formatter to allow this case. The code below adds two lines to the existing implementation so that negative indices are resolved as integers.
from _string import formatter_field_name_split
from string import Formatter as BuiltinFormatter


class Formatter(BuiltinFormatter):
    """
    Implements negative indexing for format fields.
    """
    def get_field(self, field_name, args, kws):
        # eg: field_name = '0[name]' or 'label.title' or 'some_keyword'
        # OR 'key[-1]' implemented here
        first, rest = formatter_field_name_split(field_name)
        obj = self.get_value(first, args, kws)
        # loop through the rest of the field_name, doing
        # getattr or getitem as needed
        for is_attr, i in rest:
            if is_attr:
                obj = getattr(obj, i)
            else:
                # fix to allow negative indexing next two lines :)
                # (plain numeric indices already arrive as int, so check for str first)
                if isinstance(i, str) and i.startswith('-') and i[1:].isdigit():
                    i = int(i)
                obj = obj[i]
        return obj, first


# test
if __name__ == '__main__':
    formatter = Formatter()
    print(formatter.format('hello {worlds[-1]}!',
                           worlds=('earth', 'mars', ..., 'world')))
Which outputs
hello world!
as desired.
Upvotes: 0
Reputation: 199
I often take Python format strings as config options - with the format string provided with a specific, known list of keyword arguments. Therefore addressing the indexes of a variable length list forwards or backwards within the format string is exactly the kind of thing I end up needing.
I've just written this hack to make the negative indexing work:
import re

string_to_tokenise = "Hello_world"
tokens = re.split(r"[^A-Z\d]+", string_to_tokenise, flags=re.I)
token_dict = {str(i) if i < 0 else i: tokens[i] for i in range(-len(tokens) + 1, len(tokens))}
print("{thing[0]} {thing[-1]}".format(thing=token_dict))
Result:
Hello world
To explain: instead of passing in the list of tokens, I create a dictionary with all the integer keys needed to index the list from 0 to len(..)-1. I also add the negative keys for indexing from the end, from -1 to -(len(..)-1), but these keys are converted from integers to strings, as that's how format will interpret them.
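The same idea can be wrapped in a small helper. This is only a sketch: negative_indexable is a hypothetical name, not something from the standard library, and unlike the snippet above it also includes the key for -len(..), so every valid negative index is covered.

```python
def negative_indexable(seq):
    """Hypothetical helper: build a dict that str.format can index from either end."""
    n = len(seq)
    # Non-negative indices are parsed as ints by str.format, so keep them as ints.
    mapping = {i: seq[i] for i in range(n)}
    # Negative indices reach the dict as strings such as '-1', so store them as strings.
    mapping.update({str(i): seq[i] for i in range(-n, 0)})
    return mapping


tokens = ["Hello", "big", "world"]
greeting = "{thing[0]} {thing[-1]}".format(thing=negative_indexable(tokens))
print(greeting)  # Hello world
```

The trade-off is that the whole sequence is copied into a dict for every call, which is fine for short token lists but wasteful for large ones.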
Upvotes: 0
Reputation: 882421
It's what I would call a design glitch in the format string specs. Per the docs,
element_index ::= integer | index_string
but, alas, -1 is not "an integer" -- it's an expression. The unary-minus operator doesn't even have particularly high priority, so that for example print(-2**2) emits -4 -- another common issue and arguably a design glitch (the ** operator has higher priority, so the raise-to-power happens first, then the change-sign requested by the lower priority unary -).
Anything in that position in the format string that's not an integer (but, for example, an expression) is treated as a string, to index a dict argument -- for example:
$ python3 -c "print('The last:{0[2+2]}'.format({'2+2': 23}))"
The last:23
Not sure whether this is worth raising an issue in the Python trac, but it's certainly a somewhat surprising behavior:-(.
Upvotes: 16
Reputation: 83002
There are a few problems here, once you start digging:
The item in question is called "element_index" which is defined to be an integer.
Problem 1: unless users follow the link from "integer" to the language reference manual, they won't know that -1 is deemed to be an expression, not an integer. By the way, anyone tempted to say "works as documented" should see problem 7 first :-)
Preferred solution: change the definition so that "element_index" can have an optional '-' before the integer.
It's an integer, right? Not so fast ... later the docs say that "an expression of the form '[index]' does an index lookup using __getitem__()"
Problem 3: Should say '[element_index]' (index is not defined).
Problem 4: Not everybody knows off the top of their heads what __getitem__() does. Needs clearer docs.
So we can use a dict here as well as an integer, can we? Yes, with a problem or two:
The element_index is an integer? Yes, that works with a dict:
>>> "{0[2]}".format({2: 'int2'})
'int2'
It seems that we can also use non-integer strings, but this needs more explicit documentation (Problem 5):
>>> "{0[foo]}".format({'foo': 'bar'})
'bar'
But we can't use a dict with a key like '2' (Problem 6):
>>> "{0[2]}".format({'2': 'str2'})
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 2
>>> "{0['2']}".format({'2': 'str2'})
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: "'2'"
Problem 7: That "integer" should really be documented to be "decimalinteger" ... 0x22 and 0b11 are treated as str, and 010 (an "octalinteger") is treated as 10, not 8:
>>> "{0[010]}".format('0123456789abcdef')
'a'
Update: PEP 3101 tells the true story:
"""
The rules for parsing an item key are very simple. If it starts with a digit, then it is treated as a number, otherwise it is used as a string.
Because keys are not quote-delimited, it is not possible to specify arbitrary dictionary keys (e.g., the strings "10" or ":-]") from within a format string.
"""
Upvotes: 4
Reputation: 172319
Correct, it does not work. Solution:
>>> 'The last:{0}'.format(a[-1])
'The last:3'
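A few equivalent spellings, for comparison. In each one the indexing happens in ordinary Python code before formatting, so the format-string parser never sees the -1. The f-string variant assumes Python 3.6 or later; the variable names are just for illustration.

```python
a = [1, 2, 3]

# Index first, then pass the value positionally (same as the line above).
by_position = 'The last:{0}'.format(a[-1])

# Same idea with a named field, which reads better in longer templates.
by_keyword = 'The last:{last}'.format(last=a[-1])

# f-strings evaluate full Python expressions, so negative indices work directly.
by_fstring = f'The last:{a[-1]}'

print(by_position, by_keyword, by_fstring, sep='\n')
```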
Upvotes: 1