IndexError: string index out of range python, numpy

Question

I have a long piece of code that I've been tearing my hair out over. If I run it on this input file

# 96.52 0.0036
#
#
0.860    9.38   0.938   35   I      I_band
1.235    6.452   0.030  41   J      2MASS 
1.66     5.471   0.021  42   H      2MASS 
2.16     5.069   0.023  43   K      2MASS 
9.0          9.760e-01   8.51e-03   0    AKARI09     0.52
18.0     2.609e-01   3.67e-02   0    AKARI18     0.52
#

I get

File 'myfile.py', line 811, in 
err=np.append(err,data[i][2])
IndexError: string index out of range

However, if I run

# 96.52 0.0036
#
#
0.860    9.38   0.938   35   I      I_band
1.235    6.452   0.030  41   J      2MASS 0.134
1.66     5.471   0.021  42   H      2MASS 0.134
2.16     5.069   0.023  43   K      2MASS 0.134
9.0          9.760e-01   8.51e-03   0    AKARI09     0.52
18.0     2.609e-01   3.67e-02   0    AKARI18     0.52
#

the code works as it should. Both examples are saved as .dat files, and the code prompts me for the path to one of them.

I've been trying to figure this out for maybe 24 hours now (I know, right?) with absolutely no success. I can't pinpoint my problem. ANY advice would be welcomed at this point. Thanks as always!

EDIT: If I change the xrange calls to range (around line 800) and change extend back to append (see lines 792 and 798), I now get this error:

File 'myfile.py', line 807, in 
 if(data[i]=='#'): comments=comments+1
IndexError: list index out of range

tiago · Accepted Answer

I don't have time to follow through the 1000+ lines of your code (and I doubt many will). But from what I could see at a glance, you seem to be reinventing the wheel in the way you read your files. You are getting the error because some entries of your data array do not have the type or size you expect.
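As a minimal sketch of how that message can arise (an assumption about the shape of `data`, not a diagnosis of the actual code): if an entry is a bare string such as '#' rather than a list of fields, asking for its third element fails with exactly this error. The `data` list and the `third_field` helper below are hypothetical stand-ins.

```python
# Hypothetical stand-in for the question's `data`: one entry is a bare
# comment marker (a string), the other is a row already split into fields.
data = ["#", ["0.860", "9.38", "0.938", "35", "I", "I_band"]]


def third_field(entry):
    """Return entry[2], or None if the entry is too short to have one."""
    try:
        return entry[2]
    except IndexError as exc:
        # For a one-character string the message is
        # "string index out of range".
        print("skipping %r: %s" % (entry, exc))
        return None


errors = [third_field(entry) for entry in data]
```

Printing `type(data[i])` and `len(data[i])` just before the failing line would show whether this is what happens in your case.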

I'd suggest getting acquainted with numpy's loadtxt or genfromtxt functions. You can probably get most of the file's data in the format that you want with a single call. (All the open calls in your code seem to be in binary mode, so I don't know how the text file enters.) I don't know exactly what your format is, but for example, you could do something like this:

import numpy as np
result = np.genfromtxt('file', dtype=[('wave','f'), ('flux', 'f'),
                                      ('err', 'f'), ('code', 'i'), 
                                      ('band', 'S8'), ('survey', 'S8')])

The result is a structured array which you can index by the dtype strings:

In [16]: result['wave']
Out[16]:
array([  0.86000001,   1.23500001,   1.65999997,   2.16000009,
         9.        ,  18.        ], dtype=float32)

In [17]: result['err']
Out[17]:
array([ 0.93800002,  0.03      ,  0.021     ,  0.023     ,  0.00851   ,
        0.0367    ], dtype=float32)

In [18]: result['band']
Out[18]:
array(['I', 'J', 'H', 'K', 'AKARI09', 'AKARI18'],
      dtype='|S8')

Here I saved the last column as a string, so you may have to convert the numbers in the last two rows.
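A self-contained sketch of that conversion, assuming Python 3 and a reasonably recent numpy. The first sample file is inlined through io.StringIO so the snippet runs without a file on disk. The field name 'last', the 'U8' and 'f8' dtypes, and the `to_float_or_nan` helper are illustrative choices, not part of the original code.

```python
import io

import numpy as np

# The first sample file from the question, inlined for the example.
SAMPLE = """\
# 96.52 0.0036
#
#
0.860    9.38   0.938   35   I      I_band
1.235    6.452   0.030  41   J      2MASS
1.66     5.471   0.021  42   H      2MASS
2.16     5.069   0.023  43   K      2MASS
9.0          9.760e-01   8.51e-03   0    AKARI09     0.52
18.0     2.609e-01   3.67e-02   0    AKARI18     0.52
#
"""

# Lines starting with '#' are treated as comments and skipped by default.
dtype = [('wave', 'f8'), ('flux', 'f8'), ('err', 'f8'),
         ('code', 'i4'), ('band', 'U8'), ('last', 'U8')]
result = np.genfromtxt(io.StringIO(SAMPLE), dtype=dtype, encoding='utf-8')


def to_float_or_nan(text):
    """Convert text to float; return NaN for labels such as 'I_band'."""
    try:
        return float(text)
    except ValueError:
        return np.nan


# Numeric where the last column holds a number, NaN where it holds a label.
extra = np.array([to_float_or_nan(s) for s in result['last']])
```

With this, `extra` is numeric for the two AKARI rows and NaN for the rows whose last column is a survey name. Note that this assumes every data row has the same number of columns, which holds for the first sample file but not the second.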
