Reputation: 373
I'm currently reading from a csv file with pandas. I'd like for the column that has a string which resembles datetime to be turned into a DateTime object. I have restrictions on my database that specify that column should be of such type.
an Example row from my csv file is:
2015-11-28 00:10:00, 99, 21
I originaly faced the problem when I had the following.
# import CSV file
df = pd.read_csv(filename, error_bad_lines=False, warn_bad_lines=False)
# TODO Save data into db
# ADD time, banana, apple
for i in df.index:
m = AggregatedMeasurement(
time=df.loc[i, 'Time UTC'],
banana=df.loc[i, 'BANANA'],
apple=df.loc[i, 'APPLE']
)
db.session.add(m) # add to db
db.session.commit() # perform them all
I received the following error.
StatementError: (builtins.TypeError) SQLite DateTime type only accepts Python datetime and date objects as input
I started searching here and found that one can indicate date_parser and started combinations. This is what I have tried so far with the help of this suggestion. Can pandas automatically recognize dates
dateparse = lambda x: pd.datetime.strptime(x, '%Y-%m-%d %H:%M:%S')
# import CSV file
df = pd.read_csv(filename, error_bad_lines=False, warn_bad_lines=False, parse_dates=['datetime'],
date_parser=dateparse)
and now I am getting the following error
ValueError: 'datetime' is not in list
Am I implementing this wrong? Thanks in advance!
UPDATE 1
After the following changes
dateparse = lambda x: pd.datetime.strptime(x, '%Y-%m-%d %H:%M:%S')
# import CSV file
df = pd.read_csv(filename, error_bad_lines=False,
warn_bad_lines=False,
parse_dates=['Time UTC'],
date_parser=dateparse)
I got a TypeError: 'NoneType' object is not subscriptable
on this specific part date_parser=dateparse)
UPDATE 2 Later, I tried the next suggestion.
df = pd.read_csv(filename, error_bad_lines=False,
warn_bad_lines=False)
# Changing datatype
df['Time UTC'] = pd.to_datetime(df['Time UTC'])
And I think that worked :) I got another error but that corresponds to another database restriction relationship/ForeignKey that I haven't set up yet.
Upvotes: 3
Views: 7618
Reputation: 3483
Could be a simple mistake. I think you should do this :
df = pd.read_csv(filename,
error_bad_lines=False,
warn_bad_lines=False,
parse_dates=['Time UTC'],
date_parser=dateparse)
Indeed, you should refer parse_dates
to 'Time UTC'
which is the column of interest, and not 'datetime'
.
Otherwise, you may more easily let pd.to_datetime()
handle the task, using :
df = pd.read_csv(filename, error_bad_lines=False,
warn_bad_lines=False)
# Changing datatype
df['Time UTC'] = pd.to_datetime(df['Time UTC'])
Upvotes: 2