Reputation: 526

Extract variables using python regex

Input file contains following lines:

a=b*c;
d=a+2;
c=0;
b=a;

Now for each line I want to extract variables that has been used.For example, for line 1, the output should be [a,b,c].Currently I am doing as follows :

var=[a,b,c,d]     # list of variables
for line in file_ptr :
    if '=' in line :
        temp=line.split('=') :
        ans=list(temp[0])
        if '+' in temp[1] :
             # do something
        elif '*' in temp[1] :
             # do something
        else :
             # single variable as line 4  OR constant as line 3

Is it possible to do this using regex?

EDIT:

Expected output for above file :

[a,b,c]
[d,a]
[c]
[a,b]

Upvotes: 0

Answers (5)

Wiktor Stribiżew

Reputation: 626853

I'd use some shorter pattern for matching variable names:

import re
strs = ['a=b*c;', 'd=a+2;', 'c=0;', 'b=a;']
print([re.findall(r'[_a-z]\w*', x, re.I) for x in strs])

See the Python demo

Pattern matches:

[_a-z] - a _ or an ASCII letter (any upper or lowercase due to the case insensitive modifier use re.I)
\w* - 0 or more alphanumeric or underscore characters.

See the regex demo

Upvotes: 1

Robᵩ

Reputation: 168626

I would use re.findall() with whatever pattern matches variable names in the example's programming language. Assuming a typical language, this might work for you:

import re

lines = '''a=b*c;
d=a+2;
c=0;
b=a;'''

for line in lines.splitlines():
    print re.findall('[_a-z][_a-z0-9]*', line, re.I)

Upvotes: 1

AkaSh

Reputation: 526

This is how I did :

l=re.split(r'[^A-Za-z]', 'a=b*2;')
l=filter(None,l)

Upvotes: 0

Daniel Roseman

Reputation: 599610

I'm not entirely sure what you're after, but you can do something like this:

re.split(r'[^\w]', line)

to give a list of the alphabetic characters in the line:

>>> re.split(r'[^\w]', 'a=b*c;')
['a', 'b', 'c', '']

Upvotes: 0

lesingerouge

Reputation: 1168

If you want just the variables, then do this:

answer = []
for line in file_ptr :
    temp = []
    for char in line:
        if char.isalpha():
            temp.append(char)
    answer.append(temp)

A word of caution though: this would work only with variables that are exactly 1 character in length. More details about isalpha() can be found here or here.

Upvotes: 0

Extract variables using python regex

Answers (5)

Related Questions