guy tal

Reputation: 1

Python: "string index out of range" error

f=open('sequence3.fasta', 'r')
str=''

for line in f:
    line2=line.rstrip('\n')
    if (line2[0]!='>'):
        str=str+line2
    elif (len(line)==0):
        break

str.rstrip('\n') 
f.close()

The script is supposed to read 3 DNA sequences and join them into one sequence. The problem is, I get this error:

IndexError: string index out of range

And when I write it like this:

f=open('sequence3.fasta', 'r')
str=''

for line in f:
    line.rstrip('\n')
    if (line[0]!='>'):
        str=str+line
    elif (len(line)==0):
        break

str.rstrip('\n') 
f.close()

It runs, but there are spaces in between. Thanks

Upvotes: 0

Views: 437

Answers (4)

Andrew_CS

Reputation: 2562

You shouldn't use your second code example, because it doesn't save the return value of rstrip. rstrip doesn't modify the string it is called on; per the documentation, it returns a copy of the string with trailing characters removed.

Also, in your if/elif statement, the first condition you check should be for length 0; otherwise you'll get an error when you index into an empty string.

Additionally, the break in your if/elif statement will end the loop early if the file contains an empty line. Instead of breaking, you could simply do nothing when the length is 0.
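A quick way to see this for yourself, as a minimal sketch (the sample string and variable names are made up for illustration):

```python
# Strings are immutable, so rstrip() returns a new string instead of
# changing the one it is called on.
line = "ACGT\n"

line.rstrip("\n")             # return value discarded; line is unchanged
unchanged = line

stripped = line.rstrip("\n")  # keep the returned copy

print(repr(unchanged))
print(repr(stripped))
```

The first repr still shows the trailing newline, which is why the second script ends up with gaps between the sequences.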

if (len(line2) != 0):
    if (line2[0] != '>'):
        str = str+line2

Also, the line near the end, str.rstrip('\n'), isn't doing anything, since the return value of rstrip isn't saved.
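Putting these points together, here is one possible sketch of the whole loop. The helper name join_sequences and the sample data are assumptions for illustration; io.StringIO stands in for the opened sequence3.fasta, and the result is collected in a list so the built-in str isn't shadowed.

```python
import io

def join_sequences(lines):
    """Concatenate all non-empty, non-header lines (hypothetical helper)."""
    parts = []
    for line in lines:
        line = line.rstrip("\n")    # save the stripped copy
        if len(line) != 0:          # check the length first, so line[0] is safe
            if line[0] != ">":      # skip FASTA header lines
                parts.append(line)
    return "".join(parts)

# Made-up sample data standing in for the real file, including a blank line.
sample = io.StringIO(">seq1\nACGT\n\n>seq2\nGGCC\n>seq3\nTTAA\n")
result = join_sequences(sample)
print(result)
```

With a real file you would pass the open file object instead of the StringIO, e.g. inside a with open('sequence3.fasta') as f: block.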

Upvotes: 0

Matt

Reputation: 173

line.rstrip('\n')

This returns a copy of line, and you do nothing with it. It doesn't change line.

The exception "IndexError: string index out of range" means that line[0] cannot be referenced, so line must be empty. Perhaps you should write it like this:

for line in f:
    line = line.rstrip('\n')
    if line:
        if (line[0]!='>'):
            str=str+line
    else:
        break

Upvotes: 0

Your empty-line condition is in the wrong place. Try:

for line in f:
    line = line.rstrip('\n')

    if len(line) == 0: # or simply: if not line:
        break

    if line[0] != '>':
        str=str+line

Another option is to use .startswith, which also works on an empty string: if not line.startswith('>')
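As a rough illustration of the .startswith variant (the list of lines below is invented sample data, already stripped of newlines): calling startswith on an empty string simply returns False, so no length check is needed, and an empty line contributes nothing when joined.

```python
# Invented sample lines, already stripped of their trailing newlines.
lines = [">seq1", "ACGT", "", ">seq2", "GGCC"]

# startswith() is safe on an empty string; indexing with [0] is not.
empty_check = "".startswith(">")

# Keep everything that is not a header; the empty line adds nothing.
kept = [line for line in lines if not line.startswith(">")]
joined = "".join(kept)

print(empty_check)
print(joined)
```

Note that this version skips empty lines rather than stopping at the first one, so it behaves differently from the loop with break.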

Upvotes: 0

BartoszKP

Reputation: 35891

The second version doesn't crash because the line line.rstrip('\n') is a no-op: rstrip returns a new string and doesn't modify the existing one (line). The first version crashes probably because your input file contains empty lines, so line.rstrip returns an empty string. Try this:

f=open('sequence3.fasta', 'r')
str=''

for line in f:
    line2=line.rstrip('\n')
    if line2 and line2[0]!='>':
        str=str+line2
    elif len(line)==0:
        break

if line2 is equivalent to if len(line2) > 0. Similarly, you could replace your elif len(line)==0 with elif not line.
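If you want to convince yourself of the equivalence, a small sketch like the following may help (the sample strings are made up). It also shows why the combined condition is safe: and short-circuits, so line2[0] is never evaluated when line2 is empty.

```python
samples = ["", "ACGT", ">seq1"]

# Truthiness of a string matches "length greater than zero".
truthy = [bool(s) for s in samples]
by_length = [len(s) > 0 for s in samples]

# Short-circuiting: with an empty string, line2[0] is never evaluated,
# so no IndexError is raised.
line2 = ""
is_sequence = bool(line2 and line2[0] != ">")

print(truthy == by_length)
print(is_sequence)
```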

Upvotes: 2
