user3368952

Reputation: 1

Parsing a file with Python

I'm trying to parse an alignment output file using Python, but I'm running into problems.

The file has three main "blocks" of information, and I want to extract the values from the third block, which has this structure:

  I    J ILEN JLEN     MATCH  NGAPS  NALIG NIDENT    %IDENT       NAS     NASAL  NRANS     RMEAN     STDEV     SCORE

1    2  177  104    433.00      7    104     20     19.23    416.35    335.58      0      0.00      0.00      0.00      1

1    3  177  107    427.00      6    107     21     19.63    399.07    331.78      0      0.00      0.00      0.00      2

1    4  177  126    480.00      4    126     15     11.90    380.95    342.86      0      0.00      0.00      0.00      3

So I have written this:

infile=open('ig_pairs.out')

init=infile.readline()

while init[:6] !='  1542':

    line=infile.readline()

colnames=['I', 'J', 'ILEN', 'JLEN', 'MATCH', 'NGAPS', 'NALIG', 'NIDENT', '%IDEN$

file=open('ig_file.txt', 'w')

for c in colnames:
    file.write(c + '\t')

for line in infile.readline():

    I=line[5:7]
    J=line[9:11]
    ILEN=line[14:16]
    JLEN=line[19:21]
    MATCH=line[26:31]
    NGAPS=line[37:38]
    NALIG=line[43:45]
    NIDENT=line[50:52]
    IDEN=line[58:62]
    NAS=line[68:72]
    NASAL=line[77:82]
    NRANS=line[87:89]
    RMEAN=line[93:99]
    STDEV=line[105:109]
    SCORE=line[114:119]
    NUMBER=line[120:126]

    file.write('\n' + I + '\t' + J + '\t' + ILEN + '\t' + JLEN + '\t' + MATCH + '\t' + NGAPS + '\t' + NALIG + '\t' + NIDENT + '\t' + IDEN + '\t' + NAS + '\t' + NASAL + '\t' + NRANS + '\t' + RMEAN + '\t' + STDEV + '\t' + SCORE + '\t' + NUMBER)

file.close()

But for some reason it is not working. I don't get any error message; the terminal just hangs and nothing happens.

Any idea what's wrong?

Upvotes: 0

Views: 65

Answers (2)

Kartik Anand

Reputation: 4609

I guess the problem is with:

for line in infile.readline():

readline() returns a single line as a string, so this loop iterates over the characters of that one line rather than over the lines of the file. Each time through the loop, line is a one-character string.

As a result, slices such as line[5:7] are always empty strings.

Use

for line in infile.readlines():
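
To make the difference concrete, here is a small sketch that uses an in-memory io.StringIO object in place of the real file. The two-line sample text is made up for illustration and is not taken from ig_pairs.out.

```python
import io

# Hypothetical two-line stand-in for the real alignment file.
sample = "   1    2  177\n   1    3  177\n"

# readline() returns one string, so looping over it yields single characters.
chars = [item for item in io.StringIO(sample).readline()]

# Looping over the file object itself (or over readlines()) yields whole lines.
lines = [item for item in io.StringIO(sample)]

print(len(chars), "items from readline(), each one character long")
print(len(lines), "items from the file object, each a full line")

# Slicing a one-character string past its end gives an empty string.
print(repr(chars[0][5:7]))
```

Iterating over the file object directly is usually preferable to readlines() for large files, since it does not load every line into memory at once.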

Upvotes: 2

patsweet

Reputation: 1558

It appears that this is creating an infinite loop:

while init[:6] !='  1542':
    line=infile.readline()

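The loop condition tests init, but the loop body assigns each new line to line, so init never changes. Unless the very first line of the file happens to start with '  1542', the condition stays true forever.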

Change line = infile.readline() to init = infile.readline().
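
A minimal sketch of the corrected skip loop follows, again using io.StringIO as a stand-in for the real file. The helper name skip_to_marker and the sample text are hypothetical. It also stops at end of file: readline() returns an empty string there, so without that extra check the loop would still spin forever if the marker never appears.

```python
import io

def skip_to_marker(handle, marker):
    """Read lines until one starts with marker; return it, or '' at end of file."""
    line = handle.readline()
    # An empty string means end of file, so the loop always terminates.
    while line and not line.startswith(marker):
        line = handle.readline()
    return line

# Made-up sample: two unrelated lines, the marker line, then one data line.
sample = io.StringIO("block one\nblock two\n  1542 header\n   1    2  177\n")
found = skip_to_marker(sample, "  1542")
rest = sample.readlines()  # whatever follows the marker line

# If the marker is absent, the helper returns '' instead of hanging.
missing = skip_to_marker(io.StringIO("a\nb\n"), "  1542")

print(repr(found))
print(rest)
print(repr(missing))
```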

Upvotes: 2
