ah bon

Reputation: 10031

Convert float format to year-month format in Python

I have an Excel file in which the date column is not stored as a date type, so for the date 2018.10 the trailing 0 has been dropped and it shows as 2018.1.

date
2018.12
2018.11
2018.1 
2018.9 
2018.8 
2018.7 
2018.6
2018.5
2018.4
2018.3
2018.2
2018.1

How can I convert this column to year month format correctly? Thank you.

I tried df['date'] = pd.to_datetime(df['date'].map('{:.1f}'.format), format='%Y.%m'), but I get this:

8    2018-01-01
9    2018-01-01
10   2018-01-01
11   2018-09-01
12   2018-08-01
13   2018-07-01
14   2018-06-01
15   2018-05-01
16   2018-04-01
17   2018-03-01
18   2018-02-01

Upvotes: 1

Views: 371

Answers (2)

jezrael

Reputation: 862791

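First convert the values to strings and then to datetimes. At this point 2018.1 is parsed as January in both places where it appears.

Then correct October: the wrong row is the one parsed as month 1 whose previous row is month 11 and whose next row is month 9: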

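import pandas as pd

# Parse the floats as strings; 2018.1 is read as January for now
df['date'] = pd.to_datetime(df['date'].astype(str), format='%Y.%m')
mo = df['date'].dt.month
# The mislabelled October sits between November (previous row) and September (next row)
mask = mo.shift().eq(11) & mo.eq(1) & mo.shift(-1).eq(9)
# DateOffset(month=10) replaces the month with 10 rather than adding 10 months
df.loc[mask, 'date'] = df.loc[mask, 'date'] + pd.offsets.DateOffset(month=10)
print(df)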
         date
0  2018-12-01
1  2018-11-01
2  2018-10-01
3  2018-09-01
4  2018-08-01
5  2018-07-01
6  2018-06-01
7  2018-05-01
8  2018-04-01
9  2018-03-01
10 2018-02-01
11 2018-01-01

Upvotes: 2

Sam Mason

Reputation: 16184

It might be easiest to fix this in the Excel file! If you've got a lot of data (thousands of rows), then it may be worth writing code. The code options are:

  • look at the rows above and below and try to infer whether .1 means January or October
  • ignore the column: if you have data for every month, just generate the correct sequence
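
The second option could look something like the sketch below. It assumes the rows are in strictly descending monthly order with no gaps, and that the first row is December 2018; the sample frame, the `start` value and the `rebuilt` and `parsed` names are only illustrative. The check in the middle compares the generated months with the rows that were never ambiguous, so a gap in the data should show up as a failed assertion rather than silently wrong dates.

```python
import pandas as pd

# Sample data mirroring the question: 2018.10 has collapsed to 2018.1
df = pd.DataFrame({"date": [2018.12, 2018.11, 2018.1, 2018.9, 2018.8, 2018.7,
                            2018.6, 2018.5, 2018.4, 2018.3, 2018.2, 2018.1]})

# Naive parse of the original column, kept only for the sanity check below
parsed = pd.to_datetime(df["date"].astype(str), format="%Y.%m")

# Assumption: the first row is December 2018 and each later row is one month earlier
start = pd.Timestamp("2018-12-01")
months = pd.date_range(end=start, periods=len(df), freq="MS")[::-1]
rebuilt = pd.Series(months, index=df.index)

# Rows whose naive month is not 1 are unambiguous, so they should agree
unambiguous = parsed.dt.month.ne(1)
assert (rebuilt[unambiguous] == parsed[unambiguous]).all()

df["date"] = rebuilt
print(df)
```

If you want a year-month value rather than a timestamp, df["date"].dt.to_period("M") converts the column afterwards.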

Upvotes: 0
