Rob Fagen
Rob Fagen

Reputation: 884

Is it Pythonic to depend on ValueError when parsing a parameter string?

My specific concerns are

  1. When parsing a parameter, is it intuitive for a future maintainer to understand code that depends on throwing an error?
  2. Is it expensive to be throwing exceptions as a matter of course for the default case? (seems like it might be according to https://stackoverflow.com/a/9859202/776940 )

Context

I have a parameter counter that determines the name of a counter to increment, and optionally can increment by a positive or negative integer separated from the counter name by an =. If no increment value is provided, the default magnitude of the increment is 1. The function is fed by breaking up a comma delimited list of counters and increments, so a valid input to the whole process can look like:

"counter1,counter2=2,counter3=-1"

which would increment "counter1" by 1, increment "counter2" by 2 and decrement "counter3" by 1.

How I Originally Wrote It

counterDescriptor = counterValue.split('=')
if len(counterDescriptor) == 1:
    counterName = counterDescriptor[0]
    counterIncr = 1
elif len(counterDescriptor) == 2:
    counterName = counterDescriptor[0]
    counterIncr = int(counterDescriptor[1])
else:
    counterName, counterIncr = ('counterParsingError', 1)

which strikes me, as I recently came back to look at it, as overly verbose and clunky.

Is this a more or less Pythonic way to code that behavior?

def cparse(counter):
    try:
        desc,mag = counter.split('=')
    except ValueError:
        desc = counter
        mag = ''
    finally:
        if mag == '':
            mag = 1
    return desc, int(mag)

With these test cases, I see:

>>> cparse("byfour=4")
('byfour', 4)
>>> cparse("minusone=-1")
('minusone', -1)
>>> cparse("equalAndNoIncr=")
('equalAndNoIncr', 1)
>>> cparse("noEqual")
('noEqual', 1)

These test cases that would have been caught how I originally wrote it (above) won't get caught this way:

>>> cparse("twoEquals=2=3")
('twoEquals=2=3', 1)
>>> cparse("missingComma=5missingComma=-5")
('missingComma=5missingComma=-5', 1)

and this last test case doesn't get caught by either way of doing it. Both make the int() vomit:

>>> cparse("YAmissingComma=5NextCounter")
ValueError: invalid literal for int() with base 10: '5NextCounter'

I'm glad I discovered this problem by asking this question. The service that consumes this value would eventually choke on it. I suppose I could change the one line return desc, int(mag) of the function to this:

    if desc.find("=")<0 and (mag=='0' or (mag if mag.find('..') > -1 else mag.lstrip('-+').rstrip('0').rstrip('.')).isdigit()):
        return desc, int(mag)
    else:
        return 'counterParsingError: {}'.format(desc), 1

(hat tip to https://stackoverflow.com/a/9859202/776940 for figuring out that this was the fastest way offered in that discussion to determine if a string is an integer)

Upvotes: 0

Views: 55

Answers (1)

Adam Smith
Adam Smith

Reputation: 54213

I would consider that pythonic, though you might perhaps prefer:

def cparse(counter):
    if "=" not in counter:
        # early exit for this expected case
        return (counter, 1)
    desc, mag = counter.split("=", maxsplit=1)
    # note the use of the optional maxsplit to prevent ValueErrors on "a=b=c"
    # and since we've already tested and short-circuited out of the "no equals" case
    # we can now consider this handled completely without nesting in a try block.
    try:
        mag = int(mag)
    except ValueError:
        # can't convert mag to an int, this is unexpected!
        mag = 1
    return (desc, mag)

You can tweak this to ensure you get the right output while parsing strings like a=b=c. If you expect to receive ('a', 1) then keep the code as-is. If you expect ('a=b', 1) you can use counter.rsplit instead of counter.split.

Upvotes: 2

Related Questions