Reputation: 199
I am new to Python and I am trying to extract data from a large unsorted text file. I would like to know whether it is possible to extract all the data on every line where the word "stop_codon" occurs in the text document. This is what I have so far:
import re
regex = re.compile("stop_codon([^U]+)")
contigdata = open("contigs.txt").read()
for match in regex.finditer(contigdata):
    rules = match.group(0).splitlines()
    for rule in rules:
        if rule and not rule.startswith("#"):
            print rule
This is the output the script produces; I would prefer it to be all on one line.
contig00002 A
stop_codon 2076 2078 . + 0 transcript_id "g2.t1"; gene_id "g2";
Any help would be greatly appreciated!
Upvotes: 1
Views: 132
Reputation: 239473
If you just want to print all the output on a single line, change
print rule
to
print rule,
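The trailing comma is Python 2 syntax: it makes `print` emit a space instead of a newline. For readers on Python 3, where `print` is a function, a rough equivalent uses the `end` argument. The sketch below is only an illustration; the function name `print_rules_on_one_line` and the sample list are hypothetical and not part of the original script.

```python
import io

def print_rules_on_one_line(rules, out=None):
    """Print non-empty, non-comment rules separated by spaces instead of newlines."""
    for rule in rules:
        if rule and not rule.startswith("#"):
            # end=" " replaces the default newline, much like the
            # trailing comma does in Python 2
            print(rule, end=" ", file=out)

# Hypothetical sample pieces, similar in shape to the output in the question
buffer = io.StringIO()
print_rules_on_one_line(
    ["contig00002\tA", "", "# a comment", "stop_codon\t2076\t2078"],
    out=buffer,
)
```

Leaving `out` as `None` writes to standard output, so the function can be dropped into the loop from the question unchanged.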
We don't really need regular expressions for this:
with open("contigs.txt") as f:
    for line in f:
        if "stop_codon" in line and line[0] != "#":
            print line,
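Reading line by line also avoids the issue in the original regex: the character class `[^U]` matches newlines too, so a match can run past the end of its line until the next `U`. Below is a minimal Python 3 sketch of the same line filter, written as a generator so it can be tried without the real file. The name `stop_codon_lines`, its `keyword` parameter, and the sample rows (including the `SRC` column) are assumptions made up for illustration.

```python
def stop_codon_lines(lines, keyword="stop_codon"):
    """Yield each non-comment line containing keyword, without its trailing newline."""
    for line in lines:
        # startswith is safe on empty strings, unlike indexing line[0]
        if keyword in line and not line.startswith("#"):
            yield line.rstrip("\n")

# Hypothetical sample data standing in for contigs.txt
sample = [
    "# comment mentioning stop_codon\n",
    'contig00002\tSRC\tstop_codon\t2076\t2078\t.\t+\t0\ttranscript_id "g2.t1"; gene_id "g2";\n',
    "contig00002\tSRC\tCDS\t1000\t2078\n",
]
matches = list(stop_codon_lines(sample))
for match in matches:
    print(match)
```

With a real file, pass the open file object instead of the list, for example `stop_codon_lines(f)` inside `with open("contigs.txt") as f:`.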
Upvotes: 1