Reputation: 73
I am trying to write out the line numbers of the lines that match a pattern, and also write out the matched lines themselves.
The line numbers are written correctly; however, Python neither writes the matched lines to the new file nor raises an error message.
#!/usr/bin/env python
import re

outputLineNumbers = open('OutputLineNumbers', 'w')
outputLine = open('OutputLine', 'w')
inputFile = open('z.vcf','r')
matchLines = inputFile.readlines()
total = 0
for i in range(len(matchLines)):
    line = matchLines[i]
    #print out the matched line number
    if re.match('(\w+)\|(\d+)\|(\w+)\|AGTA(\d+)\.(\d)\|\s(0+\d+)\s\.\s(\w)\s(\w),(\w)', line):
        total += 1
        outputLineNumbers.write( str(i+1) + "\n" )
    #WRITE out the matched line
    if line == ('(\w+)\|(\d+)\|(\w+)\|AGTA(\d+)\.(\d)\|\s(0+\d+)\s\.\s(\w)\s(\w),(\w)'):
        outputLine.write( line + "\n" )
print "total polyploid marker is : ", total
outputLineNumbers.close()
inputFile.close()
outputLine.close()
Upvotes: 0
Views: 114
Reputation: 1121466
You tried to test if the line is equal to the pattern:
if line == ('(\w+)\|(\d+)\|(\w+)\|AGTA(\d+)\.(\d)\|\s(0+\d+)\s\.\s(\w)\s(\w),(\w)'):
String equality does not magically invoke the regular expression engine when the string appears to contain a pattern, however.
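To make the difference concrete, here is a small sketch comparing the two tests on one line. The sample line is hypothetical, made up only so that it fits the pattern from the question; the pattern is written as a raw string, and the snippet uses Python 3 syntax.

```python
import re

# Pattern copied from the question, written as a raw string.
PATTERN = r'(\w+)\|(\d+)\|(\w+)\|AGTA(\d+)\.(\d)\|\s(0+\d+)\s\.\s(\w)\s(\w),(\w)'

# Hypothetical sample line, invented to fit the pattern above.
sample = "gi|12345|gb|AGTA0001.1|\t0012\t.\tA\tC,T\n"

# == compares the two strings character by character; no regex is involved.
equal_to_pattern = (sample == PATTERN)

# re.match() actually runs the regular expression against the line.
matched = re.match(PATTERN, sample) is not None

print(equal_to_pattern, matched)
```

The equality test is false for any real input line, which is why the second file stayed empty and no error was raised.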
Remove the if line == test and just write out the matched line as part of the preceding if block:
if re.match('(\w+)\|(\d+)\|(\w+)\|AGTA(\d+)\.(\d)\|\s(0+\d+)\s\.\s(\w)\s(\w),(\w)', line):
    total += 1
    outputLineNumbers.write( str(i+1) + "\n" )
    #WRITE out the matched line
    outputLine.write( line + "\n" )
Note that you can just loop over matchLines directly; use the enumerate() function to produce a running index here instead:
for i, line in enumerate(matchLines, 1):
    if re.match('(\w+)\|(\d+)\|(\w+)\|AGTA(\d+)\.(\d)\|\s(0+\d+)\s\.\s(\w)\s(\w),(\w)', line):
        total += 1
        outputLineNumbers.write("{}\n".format(i))
where i starts at 1, so there is no need to add 1 later on either.
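Putting the pieces together, the whole loop could look roughly like the sketch below. It is written for Python 3, and the helper name write_matches and the sample lines are hypothetical, introduced only for illustration. One deliberate deviation: lines read from a file keep their trailing newline, so the sketch writes line as-is rather than line + "\n", which would leave a blank line after each match.

```python
import io
import re

# Pattern from the question, compiled once and written as a raw string.
PATTERN = re.compile(
    r'(\w+)\|(\d+)\|(\w+)\|AGTA(\d+)\.(\d)\|\s(0+\d+)\s\.\s(\w)\s(\w),(\w)'
)

def write_matches(lines, numbers_out, lines_out):
    """Write the 1-based number and the text of each matching line.

    Hypothetical helper; returns how many lines matched.
    """
    total = 0
    for i, line in enumerate(lines, 1):
        if PATTERN.match(line):
            total += 1
            numbers_out.write("{}\n".format(i))
            # The line already ends with a newline, so none is added here.
            lines_out.write(line)
    return total

# Made-up input: only the second line fits the pattern.
sample_lines = [
    "##header line\n",
    "gi|12345|gb|AGTA0001.1|\t0012\t.\tA\tC,T\n",
    "gi|12345|gb|AGTA0001.1|\t0013\t.\tA\tC\n",
]
numbers_out = io.StringIO()
lines_out = io.StringIO()
total = write_matches(sample_lines, numbers_out, lines_out)
print("total polyploid marker is :", total)
```

With real files, the same function can be handed the objects returned by open(), ideally inside with blocks so that the files are closed even if an error occurs; an open input file can be passed directly in place of the list of lines.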
Upvotes: 2