Reputation: 191
I'm starting with python and pandas and matplotlib. I'm working with data with over million entries. I'm trying to change the date format. In CSV file date format is 23-JUN-11. I will like to use dates in future to plot amount of donation for each candidate. How to convert the date format to a readable format for pandas?
Here is the link to cut file 149 entries
%matplotlib
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
reader_bachmann = pd.read_csv('P00000001-ALL.csv' ,converters={'cand_id': lambda x: str(x)[1:]},parse_dates=True, squeeze=True, low_memory=False, nrows=411 )
date_frame = pd.DataFrame(reader_bachmann, columns = ['contb_receipt_dt'])
s = date_frame.iloc[:,0]
date_slice = pd.Series([s])
date_strip = date_slice.str.replace('JUN','6')
date = pd.to_datetime(s, format='%d%b%Y')
print(date_slice)
ValueError: could not convert string to float: '05-JUL-11'
Upvotes: 4
Views: 4574
Reputation: 118
As an alternative you can use dateutil.parser to parse dates containing string directly, I have created a random dataframe for demo.
l = []
for i in range(100):
l.append('23-JUN-11')
B = pd.DataFrame({'Date':l})
Now, Let's import dateutil.parser and apply it on our date column
import dateutil.parser
B['Date2'] = B['Date'].apply(lambda x : dateutil.parser.parse(x))
B.head()
Out[106]:
Date Date2
0 23-JUN-11 2011-06-23
1 23-JUN-11 2011-06-23
2 23-JUN-11 2011-06-23
3 23-JUN-11 2011-06-23
4 23-JUN-11 2011-06-23
Upvotes: 2
Reputation: 49812
You need to use a different date format string:
format='%d-%b-%y'
Why?
The error message gives a clue as to what is wrong:
ValueError: could not convert string to float: '05-JUL-11'
The format string controls the conversion, and is currently:
format='%d%b%Y'
And the fields needed are:
%y - year without a century (range 00 to 99)
%b - abbreviated month name
%d - day of the month (01 to 31)
What is missing is the -
that are separating the field in your data string, and the y
for a two digit year instead of the current Y
for a four digit year.
Upvotes: 7