Reputation: 1772
I am using csv.reader in Python to read a CSV file into a dictionary. The first column of the CSV is a date (in one of two possible formats), which is read in as a datetime object and becomes the key of the dict. I also read columns 3 and 4:
import datetime as dt
import csv
with open(fileInput,'r') as inFile:
    csv_in = csv.reader(inFile)
    try:
        dictData = {(dt.datetime.strptime(rows[0], '%d/%m/%Y %H:%M')): [rows[3], rows[4]]
                    for rows in csv_in}
    except:
        dictData = {(dt.datetime.strptime(rows[0], '%Y-%m-%d %H:%M:%S')): [rows[3], rows[4]]
                    for rows in csv_in}
It works, except that the first date in the file (1/7/2012 00:00) doesn't appear in the dictionary. Do I need to tell csv.reader that the first row is not a header row, and if so, how?
Upvotes: 0
Views: 662
Reputation: 9968
When you run a try/except statement, it is easy to believe that Python will first try something and, if that fails, revert your environment to the state it was in before the try block was executed. It does not do this. As such, you have to be aware of unintended side effects left behind by a failed try attempt.
What is happening in your case is that the dictionary comprehension calls next(...) on your csv.reader object (csv_in), which returns the first row of the CSV file. That row has now been used up from the iterator, and Python won't put it back if the try block fails.
An exception is then raised, presumably because the date in that row does not match the first format. When the except block takes over and its comprehension calls next(...) on csv_in, it gets the second row of the file. The first has already been consumed.
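To see the effect in isolation, here is a minimal sketch. The two-row sample data is made up for illustration, and io.StringIO stands in for your file (Python 3):

```python
import csv
import datetime as dt
import io

# Hypothetical sample data standing in for the real file
sample = io.StringIO(
    "2012-07-01 00:00:00,a,b,1,2\n"
    "2012-07-01 00:30:00,a,b,3,4\n"
)
csv_in = csv.reader(sample)

try:
    # Fails on the first row, but that row has already been consumed
    data = {dt.datetime.strptime(row[0], '%d/%m/%Y %H:%M'): [row[3], row[4]]
            for row in csv_in}
except ValueError:
    # Iteration resumes at the second row, not the first
    data = {dt.datetime.strptime(row[0], '%Y-%m-%d %H:%M:%S'): [row[3], row[4]]
            for row in csv_in}

print(len(data))  # 1: only the second row made it into the dict
```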
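A simple change to get around this is to read all the rows into a list first, so that both attempts iterate over the same data from the start. (Copying the reader with copy.copy is not a reliable fix: even if a copy could be made, it would still read from the same underlying file, whose position has already moved past the first row.)
import datetime as dt
import csv
with open(fileInput,'r') as inFile:
    # Read every row up front; a list can be iterated more than once
    csv_rows = list(csv.reader(inFile))
try:
    dictData = {dt.datetime.strptime(rows[0],'%d/%m/%Y %H:%M'):
                [rows[3],rows[4]] for rows in csv_rows}
except ValueError:
    # First format did not match; start again from the first row
    dictData = {dt.datetime.strptime(rows[0],'%Y-%m-%d %H:%M:%S'):
                [rows[3],rows[4]] for rows in csv_rows}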
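Finally, I would recommend against a bare except, which catches every exception and can hide unrelated bugs. strptime raises a ValueError when the string does not match the format, so that is the one you want to catch.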
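If it helps, one way to keep the try block narrow is to parse each date in a small helper that tries both formats in turn and catches only ValueError. This is just a sketch: parse_date and DATE_FORMATS are names I made up, and it assumes the two formats from your question. Note that it would also accept a file that mixes both formats, which may or may not be what you want.

```python
import datetime as dt

# The two date formats mentioned in the question
DATE_FORMATS = ('%d/%m/%Y %H:%M', '%Y-%m-%d %H:%M:%S')

def parse_date(text):
    """Return a datetime parsed with the first format that matches."""
    for fmt in DATE_FORMATS:
        try:
            return dt.datetime.strptime(text, fmt)
        except ValueError:
            continue  # this format did not match; try the next one
    # No format matched: fail loudly instead of silently skipping the row
    raise ValueError('unrecognised date: %r' % text)
```

With this, the dictionary can be built in a single pass over the reader, using parse_date(rows[0]) as the key, so no row is ever consumed by a failed attempt.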
Upvotes: 1