yiwei

Reputation: 4190

Why is this 'list index out of range' happening?

I have a csv file that I am reading with Python, and this is the format of the first attribute: '2011-01-01 00:00:00', i.e. it is a string that has the date and a timestamp, separated by a space. When I call split(' ') on this string, I get back ['2011-01-01', '00:00:00'], which very clearly is a list of size = 2.

This is the code I am working with:

for line in train_data:
    datetime = line[0]    # get first attribute of line
    datetime_array = datetime.split(' ')    # split on space
    print datetime_array[0]

The above code works fine, and prints out the dates only, in the expected format of 2011-01-01.

However, if I want to get the time string, I change my code to this:

for line in train_data:
    datetime = line[0]    # get first attribute of line
    datetime_array = datetime.split(' ')    # split on space
    print datetime_array[1]    # changed index from 0 to 1

The above code throws IndexError: list index out of range.

Interestingly enough, if I do this:

for line in train_data:
    datetime = line[0]
    datetime_array = datetime.split(' ')
    size = len(datetime_array)    # size = 2
    print datetime_array[size - 1]    # size - 1 = 1

The output is as expected: I get 00:00:00.

Can someone tell me why this happens? Why do I get the error when I explicitly specify the index?

Upvotes: 0

Views: 1761

Answers (2)

yiwei

Reputation: 4190

I figured it out. It was throwing that IndexError: list index out of range error because the first line of my csv file is the header row, which contains the attribute name, in this case datetime. Splitting that on a space gives just ['datetime'], which has size = 1, so index 1 is out of range.
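As a minimal sketch of one way to handle this, the header row can be consumed before the loop. The sample data, the count column, and the use of csv.reader below are assumptions for illustration (the original post does not show how train_data is created), and the snippet uses Python 3 syntax. It also shows why the size - 1 variant did not fail: for the header, size is 1, so index 0 is used and 'datetime' is printed instead of raising an error.

```python
import csv
import io

# Hypothetical sample data standing in for the real csv file.
sample = io.StringIO(
    "datetime,count\n"
    "2011-01-01 00:00:00,16\n"
    "2011-01-01 01:00:00,40\n"
)

train_data = csv.reader(sample)
header = next(train_data, None)  # consume the header row before looping

times = []
for line in train_data:
    # Every remaining row has the form 'date time' in its first attribute.
    date_part, time_part = line[0].split(' ')
    times.append(time_part)

print(header[0].split(' '))  # ['datetime'] -> only one element
print(times)                 # ['00:00:00', '01:00:00']
```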

Upvotes: 0

Reut Sharabani

Reputation: 31339

Your code is fine under the assumption that there is a space between the date and the time on every line.

The problem is that somewhere there isn't.

To find out where, and why, use this:

for line_number, line in enumerate(train_data, 1):    # count lines from 1
    datetime = line[0]    # get first attribute of line
    datetime_array = datetime.split(' ')    # split on space
    if len(datetime_array) < 2:
        print "The following line does not conform to expected format:"
        print line
        print "line number: %d" % line_number

This will print all lines that are not in the format you expect.
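The same check could also be wrapped in a function that returns the offending rows instead of printing them, which makes it easier to inspect them afterwards. This is only a sketch in Python 3 syntax: find_malformed_rows and the sample rows are hypothetical names and data, not part of the original code.

```python
def find_malformed_rows(rows):
    """Return (line_number, row) pairs whose first field is not 'date time'."""
    bad = []
    for line_number, row in enumerate(rows, 1):
        # An empty row, or a first field without a space, does not conform.
        if not row or len(row[0].split(' ')) < 2:
            bad.append((line_number, row))
    return bad


# Hypothetical rows, including a header and one row missing the time.
rows = [
    ['datetime', 'count'],
    ['2011-01-01 00:00:00', '16'],
    ['2011-01-01', '40'],
]

for line_number, row in find_malformed_rows(rows):
    print("line %d does not conform: %r" % (line_number, row))
```

With this sample, lines 1 (the header) and 3 (the row with no time) are reported.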

Upvotes: 1
