Heisenberg

Reputation: 5279

How to change date in pandas dataframe

I have a dataframe like the one below:

     day
0  2016-07-12
1  2016-08-13
2  2016-09-14
3  2016-10-15
4  2016-11-01

dtype: datetime64

I would like to set the day of every date to the first of its month, like below:

     day
0  2016-07-01
1  2016-08-01
2  2016-09-01
3  2016-10-01
4  2016-11-01

I tried

df.day.dt.day=1

but it didn't work. How can I transform the column?

Upvotes: 2

Views: 7679

Answers (1)

jezrael

Reputation: 862406

You can use numpy: first get the underlying numpy array with values, then cast it to datetime64[M] with astype. This is the fastest solution:

df['day'] = df['day'].values.astype('datetime64[M]')
print (df)
         day
0 2016-07-01
1 2016-08-01
2 2016-09-01
3 2016-10-01
4 2016-11-01
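
For anyone who wants to reproduce this from scratch, here is a minimal self-contained sketch. The frame is rebuilt from the dates in the question. The extra cast back to datetime64[ns] is my own addition (an assumption that you want the column to keep nanosecond precision) and is not part of the one-liner above.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (column name "day" as in the post).
df = pd.DataFrame({
    "day": pd.to_datetime(
        ["2016-07-12", "2016-08-13", "2016-09-14", "2016-10-15", "2016-11-01"]
    )
})

# Casting to month precision discards the day component,
# so every value is truncated to the first day of its month.
month_start = df["day"].values.astype("datetime64[M]")

# Assumption: cast back to nanosecond precision so the column dtype
# is datetime64[ns] regardless of the pandas version in use.
df["day"] = month_start.astype("datetime64[ns]")

print(df)
```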

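Another, slower solution:

# pd.datetime was deprecated in pandas 1.0 and is unavailable in recent versions;
# pd.Timestamp takes the same (year, month, day) arguments here
df['day'] = df['day'].map(lambda x: pd.Timestamp(x.year, x.month, 1))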
print (df)
         day
0 2016-07-01
1 2016-08-01
2 2016-09-01
3 2016-10-01
4 2016-11-01
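
As background on why the attempt in the question fails: the properties of the .dt accessor (day, month, ...) can only be read, not assigned to, so a new column has to be built instead. Besides the two options above, a third possibility is a round trip through monthly periods. This is only a sketch of an alternative, not something that was timed here; df2 is a hypothetical frame name used for illustration.

```python
import pandas as pd

# Hypothetical small frame for illustration, using dates from the question.
df2 = pd.DataFrame({
    "day": pd.to_datetime(["2016-07-12", "2016-08-13", "2016-11-01"])
})

# Series of timestamps -> monthly periods -> timestamps at the start of each period.
df2["day"] = df2["day"].dt.to_period("M").dt.to_timestamp()

print(df2)
```

This stays entirely within pandas, which some people find easier to read than the numpy cast.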

Timings:

#[50000 rows x 1 columns]
df = pd.concat([df]*10000).reset_index(drop=True)

def f(df):
    df['day'] = df['day'].values.astype('datetime64[M]')
    return df

print (f(df))    

In [281]: %timeit (df['day'].map(lambda x: pd.datetime(x.year, x.month, 1)))
10 loops, best of 3: 160 ms per loop

In [282]: %timeit (f(df))
100 loops, best of 3: 4.38 ms per loop

Upvotes: 3
