sheaph

Reputation: 199

Extracting Data With Python

I am new to Python and I am trying to extract data from a large unsorted text file. I would like to know if it is possible to extract all the data on every line where the word "stop_codon" occurs in the text document. This is what I have so far:

import re
regex = re.compile("stop_codon([^U]+)")

contigdata = open("contigs.txt").read()

for match in regex.finditer(contigdata):
    rules = match.group(0).splitlines()
    for rule in rules:
        if rule and not rule.startswith("#"):
            print rule

This is the output the script produces. I would prefer it to be all on one line:

contig00002 A
stop_codon  2076    2078    .   +   0   transcript_id "g2.t1"; gene_id "g2";

Any help would be gratefully appreciated!

Upvotes: 1

Views: 132

Answers (1)

thefourtheye

Reputation: 239473

If you just want to print all the output on a single line, change

print rule

to

print rule,
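
The trailing comma is Python 2 syntax: it makes print end with a space instead of a newline. On Python 3, where print is a function, a rough equivalent is the end argument. This is only a sketch; the helper name join_on_one_line and the sample strings are made up for illustration.

```python
import io

def join_on_one_line(rules, out):
    """Write each non-empty, non-comment rule followed by a space, not a newline."""
    for rule in rules:
        if rule and not rule.startswith("#"):
            # end=" " plays the role of the Python 2 trailing comma
            print(rule, end=" ", file=out)

# Made-up sample input; in the question these would come from splitlines()
buf = io.StringIO()
join_on_one_line(["contig00002 A", "# comment", "stop_codon 2076 2078"], buf)
result = buf.getvalue()
```

Passing sys.stdout as out would print to the terminal instead of collecting the text in a buffer.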

We don't really need regular expressions for this. Reading the file line by line and checking each line for the word is enough:

with open("contigs.txt") as f:
    for line in f:
        if "stop_codon" in line and line[0] != "#":
            print line,
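
The snippet above is Python 2. The same idea could be sketched for Python 3 as a small generator, assuming the file is plain text with one record per line. The function name stop_codon_lines, the keyword parameter, and the sample lines are assumptions for illustration, not part of the original data.

```python
def stop_codon_lines(lines, keyword="stop_codon"):
    """Yield every non-comment line that contains the keyword."""
    for line in lines:
        if keyword in line and not line.startswith("#"):
            # Drop the trailing newline so the caller decides how to print
            yield line.rstrip("\n")

# Made-up sample standing in for the contents of contigs.txt
sample = [
    "# header line\n",
    "contig00002 A stop_codon 2076 2078 . + 0\n",
    "contig00002 A CDS 100 200 . + 0\n",
    "# stop_codon mentioned in a comment\n",
]
matches = list(stop_codon_lines(sample))

# With a real file (assumed to exist), the call would look like:
# with open("contigs.txt") as f:
#     for line in stop_codon_lines(f):
#         print(line)
```

Because a file object is iterated line by line, the whole file never has to be read into memory, which matters for a large input.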

Upvotes: 1
