Reputation: 10031
I have an Excel file in which the date column is not stored in a date format, so for the date 2018.10 the trailing 0 has been dropped and it becomes 2018.1.
date
2018.12
2018.11
2018.1
2018.9
2018.8
2018.7
2018.6
2018.5
2018.4
2018.3
2018.2
2018.1
How can I convert this column to year month format correctly? Thank you.
I tried df['date'] = pd.to_datetime(df['date'].map('{:.1f}'.format), format='%Y.%m'), but I get this:
8 2018-01-01
9 2018-01-01
10 2018-01-01
11 2018-09-01
12 2018-08-01
13 2018-07-01
14 2018-06-01
15 2018-05-01
16 2018-04-01
17 2018-03-01
18 2018-02-01
Upvotes: 1
Views: 371
Reputation: 862791
First convert the values to strings and then to datetimes. At this point October is parsed as January, so correct it by testing for a month of 1 whose previous month is 11 and whose next month is 9:
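# Parse the float-like strings; "2018.1" is read as January for now
df['date'] = pd.to_datetime(df['date'].astype(str), format='%Y.%m')
mo = df['date'].dt.month
# A "January" sitting between November and September is really October
mask = mo.shift().eq(11) & mo.eq(1) & mo.shift(-1).eq(9)
# DateOffset(month=10) sets the month (singular keyword) instead of adding to it
df.loc[mask, 'date'] = df.loc[mask, 'date'] + pd.offsets.DateOffset(month=10)
print(df)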
date
0 2018-12-01
1 2018-11-01
2 2018-10-01
3 2018-09-01
4 2018-08-01
5 2018-07-01
6 2018-06-01
7 2018-05-01
8 2018-04-01
9 2018-03-01
10 2018-02-01
11 2018-01-01
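To see why the string conversion matters, here is a small sketch comparing the '{:.1f}' formatting from the question with astype(str). It assumes the column was read from Excel as floats; the Series named raw is made up for illustration.

```python
import pandas as pd

# Hypothetical sample: the first four values of the column, read as floats
raw = pd.Series([2018.12, 2018.11, 2018.1, 2018.9])

# Formatting to one decimal place rounds 2018.12 and 2018.11 to "2018.1",
# which is why three rows ended up as January in the question
one_decimal = raw.map("{:.1f}".format).tolist()

# A plain string conversion keeps both digits of 11 and 12;
# only October (2018.10 -> 2018.1) is still ambiguous
as_text = raw.astype(str).tolist()

print(one_decimal)  # ['2018.1', '2018.1', '2018.1', '2018.9']
print(as_text)      # ['2018.12', '2018.11', '2018.1', '2018.9']
```

With astype(str) only the October row remains wrong, which is what the mask above repairs.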
Upvotes: 2
Reputation: 16184
It might be easiest to fix this in the Excel file! If you've got a lot of data (thousands of rows) then maybe it's worth writing code. Code options are:
.1 may be January or October
Upvotes: 0
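Since a value ending in .1 can be either January or October, one possible way to decide in code is to look at the row that follows. This is only a sketch under stated assumptions: the rows are sorted newest first with no missing months, as in the question, and parse_year_month is a hypothetical helper name, not a pandas function.

```python
import pandas as pd

def parse_year_month(values):
    """Parse floats such as 2018.1 into month-start timestamps.

    Assumption: rows are sorted newest first, so a ".1" that is directly
    followed by a later month of the same year (e.g. 9) must be October.
    """
    dates = pd.to_datetime(pd.Series(values).astype(str), format="%Y.%m")
    year, month = dates.dt.year, dates.dt.month
    # Compare each row with the one below it (NaN on the last row compares False)
    is_october = month.eq(1) & year.shift(-1).eq(year) & month.shift(-1).gt(1)
    # January plus nine months is October of the same year
    return dates.where(~is_october, dates + pd.DateOffset(months=9))

result = parse_year_month([2018.12, 2018.11, 2018.1, 2018.9, 2018.2, 2018.1])
print(result)
```

Unlike a check for exactly 11 above and 9 below, this only needs the following row, but it still cannot resolve a lone .1 with no later month of the same year after it.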