Reputation: 4190
So I have a csv file that I am reading with Python, and this is the format of the first attribute: '2011-01-01 00:00:00', i.e. it is a string that has the date and a timestamp, separated by a space. When I call split() on this string, I get back ['2011-01-01', '00:00:00'], which very clearly is a list of size 2.
This is the code I am working with:
for line in train_data:
    datetime = line[0] # get first attribute of line
    datetime_array = datetime.split(' ') # split on space
    print datetime_array[0]
The above code works fine and prints out only the dates, in the expected format of 2011-01-01.
However, if I want to get the time string, I change my code to this:
for line in train_data:
    datetime = line[0] # get first attribute of line
    datetime_array = datetime.split(' ') # split on space
    print datetime_array[1] # changed index from 0 to 1
I get an IndexError: list index out of range error with the above code.
Interestingly enough, if I do this:
for line in train_data:
    datetime = line[0]
    datetime_array = datetime.split(' ')
    size = len(datetime_array) # size = 2
    print datetime_array[size - 1] # size - 1 = 1
The output is as expected, so I get 00:00:00.
Can someone tell me why this happens? Why do I get the error only when I explicitly specify the index?
Upvotes: 0
Views: 1761
Reputation: 4190
I figured it out. It was throwing that IndexError: list index out of range error because the first line of my csv file contained the attribute name, in this case datetime. Of course, this means the list for that line is just ['datetime'], which has size 1. That is also why the size - 1 version did not fail: for that line it simply used index 0.
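One way to avoid the problem is to skip the header row before the loop starts. The following is only a minimal sketch, written for Python 3 and assuming the first row of the file is always a header; the sample data and the function name read_times are made up for illustration.

```python
import csv
import io

# Hypothetical sample data standing in for the real csv file.
SAMPLE = "datetime,count\n2011-01-01 00:00:00,16\n2011-01-01 01:00:00,40\n"

def read_times(csv_file):
    """Return the time part of the first column, skipping the header row."""
    reader = csv.reader(csv_file)
    next(reader, None)  # discard the header row, e.g. ['datetime', 'count']
    times = []
    for row in reader:
        # Each remaining first field is assumed to look like 'date time'.
        date_part, time_part = row[0].split(' ')
        times.append(time_part)
    return times

times = read_times(io.StringIO(SAMPLE))
print(times)
```

With a real file, the io.StringIO object would be replaced by an open file handle. Passing None as the second argument to next() keeps an empty file from raising StopIteration.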
Upvotes: 0
Reputation: 31339
Your code is fine under the assumption that there is a space between the date and the time in every line.
The problem is that somewhere there isn't one.
To find out where, and why, use this:
line_number = 1
for line in train_data:
    datetime = line[0] # get first attribute of line
    datetime_array = datetime.split(' ') # split on space
    if len(datetime_array) < 2:
        print "The following line does not conform to expected format:"
        print line
        print "line number: %d" % line_number
    line_number += 1
This will print all lines that are not in the format you expect.
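The same check could also be written as a small function that returns the offending rows instead of printing them. This is only a sketch for Python 3; the name find_bad_rows and the sample rows are hypothetical, and enumerate stands in for the manual counter.

```python
def find_bad_rows(rows):
    """Return (line_number, row) pairs whose first field does not split into two parts."""
    bad = []
    for line_number, row in enumerate(rows, start=1):
        parts = row[0].split(' ')  # split on space, as in the original loop
        if len(parts) < 2:
            bad.append((line_number, row))
    return bad

# Hypothetical rows: a header followed by two data rows.
rows = [['datetime', 'count'],
        ['2011-01-01 00:00:00', '16'],
        ['2011-01-01 01:00:00', '40']]
bad_rows = find_bad_rows(rows)
print(bad_rows)
```

Returning the rows makes it easy to inspect them afterwards or to assert that the list is empty before processing the data.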
Upvotes: 1