rahulchem

Reputation: 123

Parsing linear equation with Parameters using REGEX

I am trying to parse linear equations using a regex. The equations look as follows:

2 * var1.val          + 7 * var2 + 9 * var3 = 1
3 * var1.val          + 4 * var2            = 9
param1.val * var1.val            + 4 * var3 = 7

The coefficients can be numbers or named parameters. I want to get the result as:

[2,          7,  9
 3,          4,  0
 param1.val, 0,  4]

I googled and found some sample code, which I modified to meet my needs. It looks like this:

import re

equations = [' 2 * var1.name + 7 * var2 + 9 * var3 = 1',
             ' 3 * var1.name + 4 * var2 = 9',
             ' param1.val * var1.name + 4 * var3 = 7']
augmented_matrix = {'__b__':[0]*3}                 # initialize the RHS vector
parse_ptrn = r'([+-]?[\d*|\w*][\*]+)(\w+\.?\w+)'        
parse_obj = re.compile(parse_ptrn)

for i in range(3):
    e = ''.join(equations[i].split())              # split and join to remove spaces
    left, right = e.split('=')                     # separate RHS and LHS

    try:
        augmented_matrix['__b__'][i] = float(right) # if possible convert RHS to float
    except ValueError:                             # RHS is not numeric, keep it as a string
        augmented_matrix['__b__'][i] = right
    # FOR LHS
    for coeff, var in parse_obj.findall(left):
        if coeff == '': coeff = 1
        elif coeff == '-': coeff = -1
        else:
            try:
                coeff = float(coeff.replace("*","")) # convert to float/Remove * from coeff
            except ValueError:                       # coefficient is a parameter name, keep the string
                coeff = coeff.replace("*","")
        if var not in augmented_matrix:
            augmented_matrix[var] = [0] * 3 
        augmented_matrix[var][i] = coeff
    print(left, right)
    print(parse_obj.findall(left))

It cannot correctly parse the third equation because of the parameter. For the first coefficient in the third equation, it gives me the last letter "l" instead of "param1.val". I believed the regex ([+-]?[\d*|\w*][\*]+) would match anything between the start of the string and * (either digits or characters).

Please help me.

Upvotes: 2

Views: 1282

Answers (2)

user557597

As a side note, if you know the number of variables in advance (3 here) and they
always appear in the same order, it is possible to do it all in a single regex.

 # ^(?=.*?\S+\s*\*\s*var[123]).*?(?:(\S+)\s*\*\s*var1.+?)?(?:(\S+)\s*\*\s*var2.+?)?(?:(\S+)\s*\*\s*var3.+?)?$

 ^ 
 (?=
      .*? \S+ 
      \s* \* \s* var [123] 
 )
 .*? 
 (?:
      ( \S+ )                       # (1)
      \s* \* \s* var1
      .+? 
 )?
 (?:
      ( \S+ )                       # (2)
      \s* \* \s* var2
      .+? 
 )?
 (?:
      ( \S+ )                       # (3)
      \s* \* \s* var3
      .+? 
 )?
 $     

Output:

 **  Grp 0 -  ( pos 0 , len 47 ) 
2 * var1.val          + 7 * var2 + 9 * var3 = 1  
 **  Grp 1 -  ( pos 0 , len 1 ) 
2  
 **  Grp 2 -  ( pos 24 , len 1 ) 
7  
 **  Grp 3 -  ( pos 35 , len 1 ) 
9  
-------------
 **  Grp 0 -  ( pos 49 , len 47 ) 
3 * var1.val          + 4 * var2            = 9  
 **  Grp 1 -  ( pos 49 , len 1 ) 
3  
 **  Grp 2 -  ( pos 73 , len 1 ) 
4  
 **  Grp 3 -  NULL 
-------------
 **  Grp 0 -  ( pos 98 , len 47 ) 
param1.val * var1.val            + 4 * var3 = 7  
 **  Grp 1 -  ( pos 98 , len 10 ) 
param1.val  
 **  Grp 2 -  NULL 
 **  Grp 3 -  ( pos 133 , len 1 ) 
4  
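For anyone who wants to try this from Python, here is a rough sketch of how the pattern above might be applied line by line with re.VERBOSE. The names ROW_PATTERN and coefficient_row are made up for this illustration, and filling missing terms with the string "0" is my own assumption about the desired output.

```python
import re

# Verbose form of the pattern above. Assumes exactly three variables
# named var1..var3 that always appear in that order.
ROW_PATTERN = re.compile(r"""
    ^
    (?= .*? \S+ \s* \* \s* var[123] )     # require at least one 'coeff * varN' term
    .*?
    (?: ( \S+ ) \s* \* \s* var1 .+? )?    # (1) coefficient of var1
    (?: ( \S+ ) \s* \* \s* var2 .+? )?    # (2) coefficient of var2
    (?: ( \S+ ) \s* \* \s* var3 .+? )?    # (3) coefficient of var3
    $
""", re.VERBOSE)

equations = [
    "2 * var1.val          + 7 * var2 + 9 * var3 = 1",
    "3 * var1.val          + 4 * var2            = 9",
    "param1.val * var1.val            + 4 * var3 = 7",
]


def coefficient_row(equation):
    """Return the three coefficients of one equation as strings (hypothetical helper)."""
    match = ROW_PATTERN.match(equation)
    if match is None:
        raise ValueError("no 'coeff * varN' term found in %r" % equation)
    # Optional groups that did not take part in the match default to "0".
    return list(match.groups(default="0"))


rows = [coefficient_row(eq) for eq in equations]
for row in rows:
    print(row)
```

The coefficients come back as strings, so numeric ones would still need a float() conversion like the one in the question.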

Upvotes: 0

Jake Griffin

Reputation: 2074

Try using this for your regular expression instead:

parse_ptrn = r'([+-]?[\w.]*\*)(\w+\.?\w+)'

I changed [\d*|\w*] (i.e., ONE character which is a digit \d, asterisk *, pipe | or word character \w) to [\w.]* (i.e., ANY NUMBER of word characters or decimal points). Note that \d is not necessary because it is a subset of \w (all digits are word characters). Also, your original code would not have worked for multi-digit coefficients, like 10, because it was only selecting ONE character before the *.
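To see the difference, a quick side-by-side check along these lines may help. OLD_PATTERN and NEW_PATTERN are just labels I picked for the two expressions, and the input strings are left-hand sides with the spaces already removed, as in the question's loop.

```python
import re

OLD_PATTERN = re.compile(r'([+-]?[\d*|\w*][\*]+)(\w+\.?\w+)')   # pattern from the question
NEW_PATTERN = re.compile(r'([+-]?[\w.]*\*)(\w+\.?\w+)')         # pattern suggested here

third_lhs = 'param1.val*var1.name+4*var3'   # third equation's LHS, spaces removed
old_third = OLD_PATTERN.findall(third_lhs)
new_third = NEW_PATTERN.findall(third_lhs)
print(old_third)   # the old class grabs a single character before the '*'
print(new_third)   # the new class grabs the whole parameter name

multi_digit = '10*var1'                      # multi-digit coefficient
old_multi = OLD_PATTERN.findall(multi_digit)
new_multi = NEW_PATTERN.findall(multi_digit)
print(old_multi)
print(new_multi)
```

With the old pattern the first coefficient comes back as 'l*' and the 10 is cut down to '0*', while the new pattern returns 'param1.val*' and '10*'. The trailing * is still there in both cases, so the replace("*", "") step in your loop is still needed.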

Please note that this will still not work for equations like var1.val + 4 * var2 = 9 due to the lack of coefficient and * in front of the first variable, var1.val. I will leave this as an exercise for you, but if you have trouble with it, just comment on this answer and I will update it to include that case as well (I assume you would want a coefficient of 1 in that case)?
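For reference, one possible way to approach that case is to make the "coefficient plus *" part optional and capture the sign separately. This is only a sketch under some assumptions: TERM and parse_lhs are names invented here, the input has already had its whitespace removed, every term on the left-hand side contains a variable (a bare constant such as 3 would be misread as a variable name), and a missing coefficient means 1.

```python
import re

# sign, optional "coefficient*", then a variable with at most one dotted attribute
TERM = re.compile(r'([+-]?)(?:([\w.]+)\*)?(\w+(?:\.\w+)?)')


def parse_lhs(left):
    """Return {variable: coefficient} for an LHS string without whitespace (hypothetical helper)."""
    terms = {}
    for sign, coeff, var in TERM.findall(left):
        if coeff == '':
            value = 1.0                  # bare variable: assume an implied coefficient of 1
        else:
            try:
                value = float(coeff)     # numeric coefficient
            except ValueError:
                value = coeff            # symbolic parameter such as param1.val
        if sign == '-':
            # negate numbers, prefix symbolic parameters with a minus sign
            value = -value if isinstance(value, float) else '-' + value
        terms[var] = value
    return terms


print(parse_lhs('var1.val+4*var2'))
print(parse_lhs('param1.val*var1.name+4*var3'))
print(parse_lhs('-var1+2.5*var2'))
```

Because the sign is captured in its own group, the coefficient group no longer contains the + or the *, so the replace("*", "") call is not needed with this variant.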

Upvotes: 1
