Reputation: 3218
I am learning Python and trying to use regex. I am used to doing this in shell scripts (awk, grep and sed), but now need to do it in Python.
In my file, I have lines like:
species,subl,cmp= 1 7 1 s1,torque=-0.65079E-11-0.59320E-15
species,subl,cmp= 1 6 1 s1,torque= 0.30782E-10 0.65641E-14
In a shell script, I can do this with:
var_s1=`grep "species,subl,cmp= $3 $4 $5" $tfile |sed -r 's/.*(.{11}).{12}/\1/'`
But when I try to do this with Python code:
#!/usr/bin/python
import sys,math,re
infile=sys.argv[1]; oufile=sys.argv[2]
ifile=open(infile, 'r'); ofile=open(oufile, 'w')
pattern=r'species,subl,cmp=\s{4}(.*)\s{4}(.*)\s{4}(.*)\s{3}s1,torque=(.*)\s{1}(.*)'
ssc1=[];ssc2=[];ssc3=[]; s1=[]; t=[]
for line in ifile:
    match = re.search(pattern, line)
    if match:
        ssc1. append(int(match.group(1)))
        ssc2. append(int(match.group(1)))
        ssc3. append(int(match.group(1)))
        s1. append(float(match.group(1)))
        t. append(float(match.group(1)))
        # ofile.write('%g %g %g' %(ssc1, s1,t))
        #print('%5.3e %5.3e' s1,t)
for i in range(len(t)):
    print('%g %g %g' % (ssc1[i], s1[i], t[i]))
ifile.close(); ofile.close()
This gives every result as 1:
$ python triel2.py out-Dy-eos2 tres
1 1 1
1 1 1
Kindly show me where I am going wrong. I am following this book. As a beginner, I would also appreciate suggestions for a better approach.
Upvotes: 2
Views: 76
Reputation: 114821
Change this:
ssc1. append(int(match.group(1)))
ssc2. append(int(match.group(1)))
ssc3. append(int(match.group(1)))
s1. append(float(match.group(1)))
t. append(float(match.group(1)))
to this:
ssc1. append(int(match.group(1)))
ssc2. append(int(match.group(2)))
ssc3. append(int(match.group(3)))
s1. append(float(match.group(4)))
t. append(float(match.group(5)))
It looks like you might have a problem with the text after "torque". In the first line of your example from the file, there is no space between the numbers. You could split those two numbers based on field width rather than the separator. One way to do this is to replace this part of the regular expression:
torque=(.*)\s{1}(.*)
with this:
torque=(.{12})(.{12})
That assumes that the numbers after "torque" each use a field width of 12 characters.
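The fixed-width idea above can be sketched as follows. This is a minimal illustration, not a drop-in replacement for the script: the helper name `parse_line` is hypothetical, the sample lines are copied from the question, and the spacing between the integer fields is an assumption, so the pattern uses `\s*` and `\s+` there instead of exact counts. It also assumes each torque value is exactly 12 characters wide.

```python
import re

# Sample lines copied from the question. The exact spacing between the
# integer fields in the real file is an assumption, so the pattern below
# matches "one or more" whitespace rather than a fixed count.
lines = [
    "species,subl,cmp= 1 7 1 s1,torque=-0.65079E-11-0.59320E-15",
    "species,subl,cmp= 1 6 1 s1,torque= 0.30782E-10 0.65641E-14",
]

# Three integer fields, then two fixed-width (12 character) number fields.
pattern = re.compile(
    r"species,subl,cmp=\s*(\d+)\s+(\d+)\s+(\d+)\s+s1,torque=(.{12})(.{12})"
)


def parse_line(line):
    """Return ((ssc1, ssc2, ssc3), s1, t), or None if the line does not match."""
    m = pattern.search(line)
    if m is None:
        return None
    # Each value comes from its own group: 1, 2, 3 for the integers,
    # 4 and 5 for the two torque numbers.
    ssc = tuple(int(m.group(i)) for i in (1, 2, 3))
    s1 = float(m.group(4))  # float() ignores the leading space, if any
    t = float(m.group(5))
    return ssc, s1, t


rows = [parse_line(line) for line in lines]
for ssc, s1, t in rows:
    print("%d %d %d %g %g" % (ssc + (s1, t)))
```

Using `\d+` for the integer fields instead of `(.*)` keeps each group from swallowing neighbouring text, which makes a wrong match easier to spot.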
An alternative would be to match everything after "torque=" with "(.*)", and then use Python string slicing to pull apart the matched text.
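The slicing alternative might look like the sketch below. The function name `split_torque` and the `width` parameter are hypothetical, and the 12-character field width is the same assumption as before.

```python
import re


def split_torque(line, width=12):
    """Split the text after 'torque=' into two fixed-width floats.

    Returns (s1, t), or None if the line has no 'torque=' part.
    The default width of 12 is an assumption based on the sample lines.
    """
    m = re.search(r"torque=(.*)", line)
    if m is None:
        return None
    tail = m.group(1)  # everything after "torque=" up to the end of the line
    s1 = float(tail[:width])
    t = float(tail[width:2 * width])
    return s1, t


# Sample line copied from the question; the two numbers are not separated
# by a space here, so splitting on whitespace would not work.
sample = "species,subl,cmp= 1 7 1 s1,torque=-0.65079E-11-0.59320E-15"
s1, t = split_torque(sample)
print(s1, t)
```

Slicing keeps the regular expression simple, at the cost of moving the field-width assumption into ordinary Python code.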
Upvotes: 1