Anjana Karawita

Reputation: 3

Python list : IndexError: list index out of range

I am new to Python and still learning. I have a text file in which the words on each line are separated by a variable number of spaces, and I am trying to read it as follows:

import re
results = []
with open ("../../103.Immune_gene_families/Immune_genes/Human/human_immunegene.hits") as file:
    for line in file:
        if not line.startswith("#"):
            line = re.sub("\s\s+" , " ", line)
            #print(line)
            ens_id = line.split(" ")[1]
            print(ens_id)

But I got the following error:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-3-469f5598d359> in <module>
      6             line = re.sub("\s\s+" , " ", line)
      7             #print(line)
----> 8             ens_id = line.split(" ")[1]
      9             print(ens_id)
     10

IndexError: list index out of range

Example lines I get with

print(line)
['ENSG00000128016', '115', '138', '107', '147', 'TF106503', '9', '32', '5.9', '8.3', '0', 'No_clan', '']
['ENSG00000128016', '135', '169', '130', '172', 'TF317698', '454', '488', '18.0', '0.00073', '0', 'No_clan', '']
['ENSG00000128016', '137', '175', '134', '196', 'TF318914', '95', '132', '21.9', '8e-05', '0', 'No_clan', '']
['ENSG00000128016', '137', '167', '130', '173', 'TF326635', '1096', '1127', '5.7', '3.3', '0', 'No_clan', '']
['ENSG00000128016', '138', '170', '133', '173', 'TF329017', '881', '912', '5.3', '4.3', '0', 'No_clan', '']
['ENSG00000128016', '139', '166', '129', '173', 'TF105541', '764', '791', '9.3', '0.38', '0', 'No_clan', '']
['ENSG00000128016', '139', '166', '132', '172', 'TF105970', '278', '305', '8.4', '0.6', '0', 'No_clan', '']
['ENSG00000128016', '140', '170', '131', '174', 'TF314946', '110', '140', '4.5', '6.3', '0', 'No_clan', '']
['ENSG00000128016', '142', '167', '134', '184', 'TF329287', '9', '33', '6.8', '2.3', '0', 'No_clan', '']

Any help with this would be much appreciated.

Thank you, AK

Upvotes: 0

Views: 591

Answers (2)

Red

Reputation: 27577

You get the IndexError because line.split(" ") returned fewer than two elements, which means that line contained no space at all. Since the ID appears to be the first field in your example output, try line.split(" ")[0] instead:

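import re
results = []
with open("../../103.Immune_gene_families/Immune_genes/Human/human_immunegene.hits") as file:
    for line in file:
        if not line.startswith("#"):
            # Collapse runs of two or more whitespace characters into one space
            # (raw string so that \s is passed to the regex engine unchanged).
            line = re.sub(r"\s\s+", " ", line)
            #print(line)
            # Index 0 always exists, even when the line contains no space.
            ens_id = line.split(" ")[0]
            print(ens_id)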
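
A variation on the same idea, as a minimal sketch: calling split() with no arguments splits on any run of whitespace and drops empty strings, so the re.sub step is not needed and blank lines can be skipped explicitly. The helper name extract_ids and the sample lines are made up for illustration, and the column layout is assumed from the output shown in the question.

```python
def extract_ids(lines):
    """Return the first whitespace-separated field of each data line."""
    ids = []
    for line in lines:
        # Skip comment lines, as in the original loop.
        if line.startswith("#"):
            continue
        fields = line.split()  # no argument: split on any run of whitespace
        if not fields:         # blank or whitespace-only line
            continue
        ids.append(fields[0])
    return ids


# Hypothetical sample standing in for the lines of the real file.
sample = [
    "# header line\n",
    "ENSG00000128016   115  138 107\n",
    "\n",
    "ENSG00000128016 135    169 130\n",
]
print(extract_ids(sample))
```

Because an open file is itself an iterable of lines, the file object from the with block could be passed in place of sample.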

Upvotes: 0

slightlynybbled

Reputation: 2645

Welcome to SO!

If you run

string = 'abc'
print(string.split(' '))

you will see that the result is

['abc']

If you tried to evaluate string.split(' ')[1], you would get an IndexError, because the list has only one element.

So, most likely, at least one line in your file does not contain the character you are splitting on.
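
One common way this happens when reading a file is a blank line, which contains only a newline and therefore no space. The snippet below is a small illustration with made-up strings, not the contents of the actual file; split_like_question is a hypothetical helper that mirrors the two steps from the question.

```python
import re


def split_like_question(line):
    """Collapse whitespace runs, then split on a single space."""
    collapsed = re.sub(r"\s\s+", " ", line)
    return collapsed.split(" ")


for line in ["ENSG00000128016   115  138\n", "\n", "single_token\n"]:
    parts = split_like_question(line)
    # A list with fewer than two elements cannot be indexed with [1].
    print(repr(line), "->", parts, "| safe to use [1]:", len(parts) > 1)
```

Printing len(parts) for each line of the real file in the same way should show which lines trigger the error.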

Upvotes: 1
