Find numbers in a string and decrement them

Question

I have an HTML page that lists a long index of topics and page numbers. I want to find all the page numbers and their anchor tag links and decrement the page numbers by 1.

Here is an example line in the HTML:

breakeven volume (BEV), 28

I'm trying to find the number 28 in both places and decrement by 1.

So far I've been able to find the number and replace it with itself, but I can't figure out how to decrement it. My code so far:

import fileinput
import re

for line in fileinput.input():
    line = re.sub(r'\>([0-9]+)\<', r'>\1<', line.rstrip())
    print(line)

JuniorCompressor · Accepted Answer

You can use a replacement function while substituting:

import re
s = 'breakeven volume (BEV), 28'
re.sub(r'page(\d+)">\1', lambda m: 'page{0}">{0}'.format(int(m.group(1)) - 1), s)

Result:

breakeven volume (BEV), 27

With page(\d+)">\1 we match page followed by a number, followed by a ">, followed by the same number as in the pattern in the first pair of parentheses (\1).

The substitution function takes as parameter a match. So we take the first group of the match (m.group(1)), which is the number, we parse it and decrement it. Then we reconstruct the new string using the decremented number.

Find numbers in a string and decrement them

Answers (2)

Related Questions