Reputation: 43
I have a txt file like this:
ASP62-Main-N LYS59-Main-O 100.00%
THR64-Side-OG1 VAL60-Main-O 100.00%
ALA66-Main-N LEU61-Main-O 100.00%
LYS33-Main-N SER30-Main-O 100.00%
I want to get the number before "-Main" or "-Side", so that the result looks like this:
62 59
64 60
66 61
33 30
I wrote some code, but the result only shows the number before "-Main".
import re

f1 = open(filename1)
for line in f1.readlines():
    N = re.compile(r'(\d+)-Main|-Side')
    n = N.findall(line)
    print(n)
The result is shown below:
['62', '59']
['', '60']
['66', '61']
['33', '30']
Could someone please give me some advice?
Upvotes: 2
Views: 70
Reputation: 107095
As @JosephSible has mentioned, you should group the patterns in your alternation, since alternation has a low precedence. In this case, though, you should use a non-capturing group for -Main and -Side, since you don't actually want them in your output:
N=re.compile(r'(\d+)(?:-Main|-Side)')
Alternatively, you can use a lookahead pattern so you don't need any capturing group:
N=re.compile(r'\d+(?=-Main|-Side)')
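As a quick check, here is a minimal sketch that runs both patterns over the sample lines from the question. The `lines` list and the names `grouped`, `lookahead`, `grouped_results`, and `lookahead_results` are only illustrative; in practice the lines would come from your file.

```python
import re

# Sample lines copied from the question (stand-in for reading the file).
lines = [
    "ASP62-Main-N LYS59-Main-O 100.00%",
    "THR64-Side-OG1 VAL60-Main-O 100.00%",
    "ALA66-Main-N LEU61-Main-O 100.00%",
    "LYS33-Main-N SER30-Main-O 100.00%",
]

# Compile once, outside the loop.
grouped = re.compile(r'(\d+)(?:-Main|-Side)')   # one capturing group: findall returns the digits
lookahead = re.compile(r'\d+(?=-Main|-Side)')   # no groups: findall returns the whole match

grouped_results = [grouped.findall(line) for line in lines]
lookahead_results = [lookahead.findall(line) for line in lines]

for numbers in grouped_results:
    print(' '.join(numbers))
```

Both patterns should give the same lists here; the lookahead version simply avoids needing a capturing group at all.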
Upvotes: 2
Reputation: 71610
Or this as full code:
import re
with open('filename.txt', 'r') as f:
    for i in f:
        print(' '.join(re.findall(r'\d{2}', i)[:-2]))
Output:
62 59
64 60
66 61
33 30
Upvotes: 2
Reputation: 48672
It's a precedence issue. Alternation happens late enough that your regex was being parsed as "numbers followed by -Main" or "-Side". Use this regex instead: (\d+)(-Main|-Side)
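To see the difference, here is a small sketch using one line from the question (the variable names are only illustrative). Note that with two capturing groups, `re.findall` returns tuples, so the number has to be picked out of each tuple:

```python
import re

line = "THR64-Side-OG1 VAL60-Main-O 100.00%"

# Original pattern: parsed as (\d+)-Main  OR  -Side.
# When the "-Side" branch matches, group 1 did not participate, so it shows up as ''.
original = re.findall(r'(\d+)-Main|-Side', line)
print(original)   # ['', '60']

# Grouped alternation: the digits are now required before either suffix.
fixed = re.findall(r'(\d+)(-Main|-Side)', line)
print(fixed)      # [('64', '-Side'), ('60', '-Main')]

# Keep only the first group (the number) from each tuple.
numbers = [num for num, _suffix in fixed]
print(' '.join(numbers))   # 64 60
```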
Upvotes: 1