BaRud

Reputation: 3218

regex with python error

I am learning Python and trying to use regular expressions. I am used to doing this in shell scripts (awk, grep and sed), but now I need to do it in Python.

In my file, I have lines like:

species,subl,cmp=    1    7    1    s1,torque=-0.65079E-11-0.59320E-15
species,subl,cmp=    1    6    1    s1,torque= 0.30782E-10 0.65641E-14

In a shell script, I can extract the value with:

var_s1=`grep "species,subl,cmp=    $3    $4    $5" $tfile |sed -r 's/.*(.{11}).{12}/\1/'`

But when I try to do the same with this Python code:

#!/usr/bin/python
import sys,math,re

infile=sys.argv[1]; oufile=sys.argv[2]
ifile=open(infile, 'r'); ofile=open(oufile, 'w')
pattern=r'species,subl,cmp=\s{4}(.*)\s{4}(.*)\s{4}(.*)\s{3}s1,torque=(.*)\s{1}(.*)'

ssc1=[];ssc2=[];ssc3=[]; s1=[]; t=[]
for line in ifile:
  match = re.search(pattern, line)
  if match:
    ssc1.   append(int(match.group(1)))
    ssc2.   append(int(match.group(1)))
    ssc3.   append(int(match.group(1)))
    s1.     append(float(match.group(1)))
    t.      append(float(match.group(1)))
#    ofile.write('%g %g %g' %(ssc1, s1,t))
#print('%5.3e %5.3e' s1,t)
for i in range(len(t)):
  print('%g %g %g' % (ssc1[i], s1[i], t[i]))

ifile.close(); ofile.close()

every value in the output is 1:

$ python triel2.py out-Dy-eos2 tres
1 1 1
1 1 1

Please show me where I am going wrong. I am following this book. As a beginner, I would also appreciate suggestions for a better approach.

Upvotes: 2

Views: 76

Answers (1)

Warren Weckesser

Reputation: 114821

Change this:

ssc1.   append(int(match.group(1)))
ssc2.   append(int(match.group(1)))
ssc3.   append(int(match.group(1)))
s1.     append(float(match.group(1)))
t.      append(float(match.group(1)))

to this:

ssc1.   append(int(match.group(1)))
ssc2.   append(int(match.group(2)))
ssc3.   append(int(match.group(3)))
s1.     append(float(match.group(4)))
t.      append(float(match.group(5)))

It looks like you might also have a problem with the text after "torque". In the first line of your example from the file, there is no space between the two numbers (the minus sign of the second number directly follows the first), so a whitespace separator cannot split them. You could split those two numbers based on field width rather than the separator. One way to do this is to replace this part of the regular expression:

torque=(.*)\s{1}(.*)

with this:

torque=(.{12})(.{12})

That assumes that the numbers after "torque" each use a field width of 12 characters.
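As a rough sketch of the field-width approach, the snippet below runs the corrected group numbers and the `(.{12})(.{12})` suffix against the two sample lines from the question. Using `\d+` and `\s+` for the integer columns is my own simplification, not part of the original pattern, and the 12-character width is an assumption read off the sample data.

```python
import re

# Sample lines copied from the question.
lines = [
    "species,subl,cmp=    1    7    1    s1,torque=-0.65079E-11-0.59320E-15",
    "species,subl,cmp=    1    6    1    s1,torque= 0.30782E-10 0.65641E-14",
]

# Assumption: the integers are separated by whitespace and each torque
# value occupies exactly 12 characters, so no separator is needed.
pattern = re.compile(
    r"species,subl,cmp=\s*(\d+)\s+(\d+)\s+(\d+)\s+s1,torque=(.{12})(.{12})"
)

rows = []
for line in lines:
    match = pattern.search(line)
    if match:
        # Each value comes from its own group: 1, 2, 3, 4, 5.
        ssc1, ssc2, ssc3 = (int(match.group(i)) for i in (1, 2, 3))
        s1 = float(match.group(4))  # float() tolerates the leading space
        t = float(match.group(5))
        rows.append((ssc1, ssc2, ssc3, s1, t))

for row in rows:
    print("%d %d %d %g %g" % row)
```

If the real file uses a different column width, only the two `{12}` quantifiers should need to change.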

An alternative would be to match everything after "torque=" with "(.*)", and then use Python string slicing to pull apart the matched text.
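A minimal sketch of the slicing alternative, using the first sample line from the question. The name `WIDTH` and the value 12 are assumptions taken from the sample data, not something the file format guarantees.

```python
import re

line = "species,subl,cmp=    1    7    1    s1,torque=-0.65079E-11-0.59320E-15"

# Capture everything after "torque=" in a single group.
match = re.search(r"torque=(.*)", line)
tail = match.group(1)

WIDTH = 12  # assumed width of each numeric field

# Cut the captured text into two fixed-width pieces and convert them.
s1 = float(tail[:WIDTH])
t = float(tail[WIDTH:2 * WIDTH])

print(s1, t)
```

Slicing keeps the regular expression simple, at the cost of moving the width assumption into the Python code.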

Upvotes: 1
