John Davis

Reputation: 303

Pandas date time format

I am currently working with 2.2 million records in a data frame that has two columns, membership_id and txn_time. The data frame looks like this:

membership_id          txn_time
 1                      2019-02-17 00:00:00.0
 2                      2018-04-23 00:00:00.0
 3                      2018-12-17 00:00:00.0
 4                      2019-02-17 00:00:00.0
 5                      2018-04-02 00:00:00.0
 6                      2018-09-10 06:20:58.0
 7                      2019-01-16 08:11:42.0

I want the data frame to look like this:

membership_id          txn_time
 1                      2019-02-17 
 2                      2018-04-23 
 3                      2018-12-17 
 4                      2019-02-17 
 5                      2018-04-02 
 6                      2018-09-10
 7                      2019-01-16 

What I have done so far -

df_txn['TXN_DATE'] = pd.to_datetime(df_txn['txn_time'], errors='coerce')

But it's not working: the time part is still shown. The number of records is also huge (2.2 million), so performance matters.

Thanks in advance.

Upvotes: 1

Views: 87

Answers (2)

Yoshitha Penaganti

Reputation: 464

This lambda function solves the problem without using the datetime library, provided txn_time holds strings:

df['txn_time'] = df['txn_time'].apply(lambda x: x.split()[0])
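On a column this large, a vectorized string slice is a possible alternative to apply with a lambda. The following is a minimal sketch, assuming txn_time is stored as strings that always begin with a 10-character YYYY-MM-DD date; the column name txn_date and the sample rows are illustrative only.

```python
import pandas as pd

# Illustrative sample; the None row stands in for a missing value.
df = pd.DataFrame({
    "membership_id": [1, 6, 7],
    "txn_time": ["2019-02-17 00:00:00.0", "2018-09-10 06:20:58.0", None],
})

# Keep only the first 10 characters (YYYY-MM-DD).
# Unlike x.split() in a lambda, the .str accessor does not raise on missing values.
df["txn_date"] = df["txn_time"].str[:10]

print(df)
```

The result is still a column of strings, not datetimes, so date arithmetic or filtering by date range would need a conversion later.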

Upvotes: 0

jezrael

Reputation: 863511

To improve performance, pass the format parameter so that pandas does not have to infer the format, then remove the time part with dt.floor. The column stays a datetime, which is better if you need to process the data later with datetime-like functions:

df_txn['TXN_DATE'] = pd.to_datetime(df_txn['txn_time'], 
                                    errors='coerce',
                                    format='%Y-%m-%d %H:%M:%S.%f').dt.floor('D')
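As a self-contained check of the approach above, here is a small sketch using a few rows from the question plus one deliberately invalid value (an assumption added to show what errors='coerce' does). It uses the uppercase 'D' alias for the daily frequency.

```python
import pandas as pd

# Sample rows from the question, plus one invalid string for illustration.
df_txn = pd.DataFrame({
    "membership_id": [1, 6, 7, 8],
    "txn_time": [
        "2019-02-17 00:00:00.0",
        "2018-09-10 06:20:58.0",
        "2019-01-16 08:11:42.0",
        "not a date",
    ],
})

# An explicit format avoids per-value format inference;
# errors="coerce" turns unparseable strings into NaT instead of raising.
parsed = pd.to_datetime(
    df_txn["txn_time"],
    errors="coerce",
    format="%Y-%m-%d %H:%M:%S.%f",
)

# Floor to the day: the time part becomes 00:00:00 and the dtype stays datetime64.
df_txn["TXN_DATE"] = parsed.dt.floor("D")

print(df_txn)
```

When every value in a datetime column is at midnight, pandas displays only the date part, which matches the output requested in the question.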

Or convert to Python date objects with dt.date, but the resulting column has object dtype:

df_txn['TXN_DATE'] = pd.to_datetime(df_txn['txn_time'], 
                                    errors='coerce',
                                    format='%Y-%m-%d %H:%M:%S.%f').dt.date
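The practical difference between the two variants is the dtype of the result. A short sketch comparing them on two values from the question (the variable names as_datetime and as_date are illustrative):

```python
import datetime

import pandas as pd

txn_time = pd.Series(["2019-02-17 00:00:00.0", "2018-09-10 06:20:58.0"])
parsed = pd.to_datetime(txn_time, format="%Y-%m-%d %H:%M:%S.%f")

# Variant 1: still a datetime64 column, so the .dt accessor keeps working.
as_datetime = parsed.dt.floor("D")

# Variant 2: plain Python datetime.date objects in an object-dtype column.
as_date = parsed.dt.date

print(as_datetime.dtype)
print(as_date.dtype)
print(as_datetime.dt.year.tolist())
print(type(as_date.iloc[0]))
```

The object column from dt.date no longer supports the .dt accessor, so the floored datetime column is usually the more convenient choice for further processing.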

Upvotes: 1
