Victor K

Reputation: 559

Check modular arithmetic conditions using regexp

I'm writing a script to parse text files (CSV, to be precise), and I want to pick lines from the files based on each line's content. There are a number of string conditions to check, so I surmised that regexp is the way to go. However, I also need to check a number at the beginning of each line against modular arithmetic conditions; so far these are n%4==k and n%2==k. It seems that there are only ad hoc solutions. n%2==k is pretty straightforward, but to check n%4==2 I had to devise something like this:

r'((^\d*[24680]|^)[26]|^\d*[13579][048])[\s;,].*' # more (unrelated) conditions follow

My questions are:

  1. Is there a way to simplify the regexp above? Are there any obvious problems with it?
  2. If I want to generalize the script to other modulo conditions (e.g. n%3==k or n%7==k), is there a feasible way to do it with regexp, or would I be better off extracting the number from the string and writing additional code to check such conditions?

Upvotes: 0

Views: 1147

Answers (1)

georg

Reputation: 214999

This seems to be more accurate for n%4==2 (ref: http://en.wikipedia.org/wiki/Divisibility_rule)

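import re

# n % 4 depends only on the last two digits of n
r = r'^[26]$|^\d*[02468][26]$|^\d*[13579][048]$'

# test: the regexp must match exactly those i with i % 4 == 2
for i in range(1, 1000):
    m = re.match(r, str(i))
    if i % 4 == 2:
        assert m, [i, i % 4]
    else:
        assert not m, i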

For n%3==0, see "Regex filter numbers divisible by 3". I'm not aware of any generic solution for mod n; in any case, it would be an interesting but purely theoretical exercise. In real life, just use ints.
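A minimal sketch of the "just use ints" approach, assuming each line starts with a non-negative integer followed by one of the delimiters from the question's pattern (whitespace, `;` or `,`). The names `LEADING_INT` and `line_matches` are hypothetical, chosen only for illustration:

```python
import re

# Hypothetical pattern: capture the leading integer, which must be
# followed by whitespace, ';' or ',' (the delimiters used in the question).
LEADING_INT = re.compile(r'^(\d+)[\s;,]')

def line_matches(line, modulus, remainder):
    """Return True if the line starts with an integer n such that
    n % modulus == remainder; lines without a leading integer fail."""
    m = LEADING_INT.match(line)
    if m is None:
        return False
    return int(m.group(1)) % modulus == remainder

# Sample data made up for the example.
lines = ["14;foo", "15,bar", "abc;baz", "21 qux"]
selected = [ln for ln in lines if line_matches(ln, 4, 2)]
```

The same function covers n%3==k or n%7==k by changing the arguments, so the regexp only has to handle the string conditions.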

Upvotes: 1
