Reputation: 518
I am trying to change a month name to a date in Python, but I'm getting an error:
ValueError: time data 'October' does not match format '%m/%d/%Y'
My CSV has values such as October in it, which I want to change to 10/01/2018.
import pandas as pd
import datetime
f = pd.read_excel('test.xlsx', 'Sheet1', index_col=None)
keep_col = ['Month']
new_f = f[keep_col]
f['Month'] = f['Month'].apply(lambda v: datetime.datetime.strptime(v, '%m/%d/%Y'))
new_f.to_csv("output.csv", index=False)
Any help would be appreciated
Upvotes: 3
Views: 6102
Reputation: 173
The answer from @DYZ did it for me. I added strftime so that the dict values are date strings in the format I wanted:
import calendar
import datetime
months = {str(name).lower(): datetime.datetime(month=val, day=1, year=2016).strftime('%d/%m/%Y')
          for val, name in enumerate(calendar.month_abbr) if val > 0}
Upvotes: 0
Reputation: 2559
I'm assuming the data is mostly in the format you have specified (mm/dd/yyyy), but some outlier rows have month names in them.
Without adding any extra dependencies:
import datetime

DATE_FORMAT = '%m/%d/%Y'
MONTH_NAME_MAP = {
    "january": 1,
    "jan": 1,
    "february": 2,
    "feb": 2,
    # ...
}

def parse_month_value(value):
    # check if the value is the name of a month
    month_int = MONTH_NAME_MAP.get(value.lower())
    if month_int:
        this_year = datetime.date.today().year
        return datetime.datetime(month=month_int, day=1, year=this_year)
    # otherwise parse it as a normal date; strptime raises ValueError on a mismatch
    return datetime.datetime.strptime(value, DATE_FORMAT)
Then apply it to the column:
f['Month'] = f['Month'].apply(parse_month_value)
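To see the fallback behaviour end to end, here is a minimal self-contained sketch of the same idea. It assumes an English (or C) locale, builds the name map from the calendar module instead of typing it out, and takes a `year` parameter that is my own addition for illustration (the function above uses the current year instead).

```python
import calendar
import datetime

DATE_FORMAT = "%m/%d/%Y"

# Full and abbreviated month names, lower-cased, mapped to month numbers.
# Index 0 of calendar.month_name / month_abbr is an empty string, so start at 1.
MONTH_NAME_MAP = {}
for number in range(1, 13):
    MONTH_NAME_MAP[calendar.month_name[number].lower()] = number
    MONTH_NAME_MAP[calendar.month_abbr[number].lower()] = number


def parse_month_value(value, year=2018):
    """Return a datetime for a month name or an mm/dd/yyyy string."""
    month_int = MONTH_NAME_MAP.get(value.strip().lower())
    if month_int is not None:
        # A bare month name becomes the first day of that month.
        return datetime.datetime(year, month_int, 1)
    # Anything else must match DATE_FORMAT, or strptime raises ValueError.
    return datetime.datetime.strptime(value, DATE_FORMAT)


values = ["October", "oct", "03/15/2018"]
parsed = [parse_month_value(v).strftime(DATE_FORMAT) for v in values]
print(parsed)  # ['10/01/2018', '10/01/2018', '03/15/2018']
```

Values that are neither a known month name nor a date in DATE_FORMAT still raise ValueError, so bad rows are not silently passed through.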
Upvotes: 0
Reputation: 366213
The whole point of passing a format string like %m/%d/%Y to strptime is that you're specifying what format the input strings are going to be in.
You can see the documentation, but it's pretty obvious that a format like %m/%d/%Y is not going to handle strings like 'October'. You're asking for a (zero-padded) month number, a slash, a (zero-padded) day number, a slash, and a four-digit year.
If you specify a format that actually does match your input, everything works without error:
>>> datetime.datetime.strptime('October', '%B')
datetime.datetime(1900, 10, 1, 0, 0)
However, that still isn't what you want, because the default year is 1900, not 2018. So, you either need to replace that, or pull the month out and build a new datetime object.
>>> datetime.datetime.strptime('October', '%B').replace(year=2018)
datetime.datetime(2018, 10, 1, 0, 0)
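Putting the two steps together for the output the question asks for, a small sketch might look like this. The function name and the `year` default are hypothetical, chosen only for illustration, and it assumes a locale with English month names.

```python
import datetime


def month_name_to_date_string(name, year=2018):
    """Turn a full month name such as 'October' into 'mm/dd/yyyy'."""
    # '%B' matches a full month name; day defaults to 1 and year to 1900.
    parsed = datetime.datetime.strptime(name, "%B")
    # Swap in the year we actually want, then format for output.
    return parsed.replace(year=year).strftime("%m/%d/%Y")


print(month_name_to_date_string("October"))  # 10/01/2018
```

With pandas, the same function could be passed to `f['Month'].apply(...)` in place of the lambda from the question.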
Also, notice that all of the strings that strptime knows about are locale-specific. If you've set an English-language locale, like en_US.UTF-8, or the C locale, then %B means the English month names, so everything is great. But if you've set, say, pt_BR.UTF-8, then you're asking it to match the Brazilian Portuguese month names, like Outubro instead of October. [1]
[1] Since I don't actually know Brazilian Portuguese, that was a pretty dumb example for me to pick… but Google says it's Outubro, and when Google Translate did so ever lead wrong one?
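If the input is known to contain English month names regardless of the machine's locale, one way to sidestep the locale issue is to skip %B and look the name up in an explicit list. This is only a sketch; `ENGLISH_MONTHS` and `english_month_to_date` are names made up for this example.

```python
import datetime

# Hard-coded English names, so the result does not depend on the active locale.
ENGLISH_MONTHS = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def english_month_to_date(name, year=2018):
    """Return the first day of the named month; raises ValueError if unknown."""
    # tuple.index raises ValueError when the name is not an English month.
    month = ENGLISH_MONTHS.index(name.strip().capitalize()) + 1
    return datetime.datetime(year, month, 1)


print(english_month_to_date("October"))  # 2018-10-01 00:00:00
```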
Upvotes: 3
Reputation: 57135
As an elaboration of the answer by @AdamSmith, a better way to define a mapping between names and dates is to use the calendar module, which already has a list of month names:
import calendar
import datetime
table = {name: datetime.datetime(month=val, day=1, year=2018)
         for val, name in enumerate(calendar.month_name) if val > 0}
Upvotes: 5
Reputation: 54273
Can't you just write a function mapping each name to a date? In fact, a dictionary will do.
def convert_monthname(monthname):
    table = {"January": datetime.datetime(month=1, day=1, year=2018),
             "February": datetime.datetime(month=2, day=1, year=2018),
             # ...
             }
    # values that are not month names are returned unchanged
    return table.get(monthname, monthname)

f['Month'] = f['Month'].apply(convert_monthname)
Upvotes: 3