Reputation: 526
The input file contains the following lines:
a=b*c;
d=a+2;
c=0;
b=a;
Now, for each line, I want to extract the variables that are used. For example, for line 1 the output should be [a,b,c].
Currently I am doing it as follows:
var = ['a', 'b', 'c', 'd']  # list of variables
for line in file_ptr:
    if '=' in line:
        temp = line.split('=')
        ans = list(temp[0])
        if '+' in temp[1]:
            pass  # do something
        elif '*' in temp[1]:
            pass  # do something
        else:
            pass  # single variable as in line 4 OR constant as in line 3
Is it possible to do this using regex?
EDIT:
Expected output for above file :
[a,b,c]
[d,a]
[c]
[a,b]
Upvotes: 0
Views: 554
Reputation: 626853
I'd use some shorter pattern for matching variable names:
import re
strs = ['a=b*c;', 'd=a+2;', 'c=0;', 'b=a;']
print([re.findall(r'[_a-z]\w*', x, re.I) for x in strs])
See the Python demo
Pattern details:
[_a-z] - a _ or an ASCII letter (upper- or lowercase, because of the case-insensitive modifier re.I)
\w* - 0 or more alphanumeric or underscore characters.
See the regex demo
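To make the pattern breakdown concrete, here is a minimal sketch that applies it line by line to file-like input. It assumes the question's sample data; io.StringIO merely stands in for the asker's file_ptr, and the helper name extract_variables is hypothetical.

```python
import io
import re

# [_a-z]\w* with re.I: a leading underscore or ASCII letter,
# followed by zero or more word characters.
IDENTIFIER = re.compile(r'[_a-z]\w*', re.I)


def extract_variables(line):
    """Return identifier-like tokens in order of appearance (hypothetical helper)."""
    return IDENTIFIER.findall(line)


# io.StringIO stands in for the real input file (an assumption for this sketch).
file_ptr = io.StringIO("a=b*c;\nd=a+2;\nc=0;\nb=a;\n")
results = [extract_variables(line) for line in file_ptr]
for names in results:
    print(names)
```

Names come back in the order they appear on the line, so the last line yields b before a. Numeric constants such as 2 and 0 are skipped because they cannot match the leading [_a-z].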
Upvotes: 1
Reputation: 168626
I would use re.findall() with whatever pattern matches variable names in the example's programming language. Assuming a typical language, this might work for you:
import re
lines = '''a=b*c;
d=a+2;
c=0;
b=a;'''
for line in lines.splitlines():
    # print() call form, so this runs on Python 3
    print(re.findall(r'[_a-z][_a-z0-9]*', line, re.I))
Upvotes: 1
Reputation: 526
This is how I did it:
import re
l = re.split(r'[^A-Za-z]', 'a=b*2;')
l = list(filter(None, l))  # drop empty strings; list() is needed on Python 3
Upvotes: 0
Reputation: 599610
I'm not entirely sure what you're after, but you can do something like this:
re.split(r'[^\w]', line)
to give a list of the word tokens in the line (\w also matches digits and underscores, and a trailing delimiter leaves an empty string at the end):
>>> re.split(r'[^\w]', 'a=b*c;')
['a', 'b', 'c', '']
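The raw split result still contains empty strings, and for a line such as d=a+2; it would also contain the constant 2. A small sketch of one way to clean that up, assuming variable names never start with a digit (split_names is a hypothetical helper name):

```python
import re


def split_names(line):
    """Split on non-word characters, then keep only name-like tokens."""
    tokens = re.split(r'[^\w]', line)
    # Drop empty strings and tokens that start with a digit (numeric constants).
    return [t for t in tokens if t and not t[0].isdigit()]


print(split_names('a=b*c;'))
print(split_names('d=a+2;'))
```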
Upvotes: 0
Reputation: 1168
If you want just the variables, then do this:
answer = []
for line in file_ptr:
    temp = []
    for char in line:
        if char.isalpha():
            temp.append(char)
    answer.append(temp)
A word of caution, though: this works only with variables that are exactly 1 character long. More details about isalpha() can be found here or here.
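The one-character limit comes from appending each character on its own. Here is a sketch that keeps the character-by-character idea but groups consecutive name characters together, assuming names are made of letters, digits, and underscores and never start with a digit (is_name_char and names_in are hypothetical helper names):

```python
from itertools import groupby


def is_name_char(ch):
    """True for characters assumed to be allowed inside a variable name."""
    return ch.isalnum() or ch == '_'


def names_in(line):
    """Collect runs of name characters, skipping runs that start with a digit."""
    names = []
    for is_name, group in groupby(line, key=is_name_char):
        token = ''.join(group)
        if is_name and not token[0].isdigit():
            names.append(token)
    return names


print(names_in('total=price*qty+2;'))
```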
Upvotes: 0