Nilani Algiriyage

Reputation: 35716

convert to datetime64 format with to_datetime()

I'm trying to convert some date-time data to datetime64 with pandas.to_datetime(). It is not working: the dtype of df['Time'] is still object. What is wrong?

Please note that I have attached my time file.

My Code

import pandas as pd
import numpy as np
from datetime import datetime

f = open('time','r')
lines = f.readlines()

t = []
for line in lines:
    time = line.split()[1][-20:]
    time2 = time[:11] + ' ' +time[12:21]
    t.append(time2)



df = pd.DataFrame(t)
df.columns = ['Time']
df['Time'] = pd.to_datetime(df['Time'])

print(df['Time'])


Name: Time, Length: 16136, dtype: object

Please find the attached time data file here.

Upvotes: 2

Views: 2374

Answers (1)

falsetru

Reputation: 369244

The file time contains some invalid data.

For example, line 8323 contains 8322 "5/Jul/2013::8:25:18 0530", which differs from a normal line such as 8321 "15/Jul/2013:18:25:18 +0530".

8321 "15/Jul/2013:18:25:18 +0530"
8322 "5/Jul/2013::8:25:18  0530"

For a normal line, time2 becomes 15/Jul/2013 18:25:18, but for the invalid line the slice yields "5/Jul/2013::8:25:18, which is not a valid date string:

15/Jul/2013 18:25:18
"5/Jul/2013::8:25:18

As a result, some lines can be parsed to datetime and others cannot, so the data is coerced to object dtype (so that it can hold both datetimes and strings).

>>> pd.Series(pd.to_datetime(['15/Jul/2013 18:25:18', '15/Jul/2013 18:25:18']))
0   2013-07-15 18:25:18
1   2013-07-15 18:25:18
dtype: datetime64[ns]

>>> pd.Series(pd.to_datetime(['15/Jul/2013 18:25:18', '*5/Jul/2013 18:25:18']))
0    15/Jul/2013 18:25:18
1    *5/Jul/2013 18:25:18
dtype: object
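To locate the offending entries rather than just observe the object dtype, one option is to ask pandas to turn unparseable values into NaT. The following is a minimal sketch, assuming a pandas version that supports the errors='coerce' and format arguments of to_datetime; the names raw, parsed and bad are only illustrative. Depending on the pandas version, a plain to_datetime call on mixed data may raise an error instead of returning the strings unchanged as shown above.

```python
import pandas as pd

# Same two sample values as in the example above: one valid, one invalid.
raw = pd.Series(['15/Jul/2013 18:25:18', '*5/Jul/2013 18:25:18'])

# errors='coerce' replaces anything that does not match the format with NaT,
# so the result keeps a datetime64 dtype instead of falling back to object.
parsed = pd.to_datetime(raw, format='%d/%b/%Y %H:%M:%S', errors='coerce')

# The rows that failed to parse are the ones that became NaT.
bad = raw[parsed.isna()]

print(parsed.dtype)
print(bad)
```

Printing bad shows which input strings need to be fixed or dropped before converting the whole column.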

If you take only the first 5 entries from the file (which have the correct date format), you will get what you expected.

...
df = pd.DataFrame(t[:5])
df.columns = ['Time']
df['Time'] = pd.to_datetime(df['Time'])

The above code yields:

0   2013-07-15 00:00:12
1   2013-07-15 00:00:18
2   2013-07-15 00:00:23
3   2013-07-15 00:00:27
4   2013-07-15 00:00:29
Name: Time, dtype: datetime64[ns]
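Instead of limiting the data to the first few rows, the malformed lines could be detected while reading the file. The sketch below uses only the standard library and assumes every good line looks like the normal line quoted earlier (an index, then a quoted timestamp in day/month-name/year:hour:minute:second form); parse_stamp, STAMP_FORMAT and sample_lines are hypothetical names, and %b assumes an English locale for month names such as Jul.

```python
from datetime import datetime

# Assumed timestamp layout, e.g. 15/Jul/2013:18:25:18
STAMP_FORMAT = '%d/%b/%Y:%H:%M:%S'

def parse_stamp(line):
    """Return a datetime for a well-formed line, or None if it is malformed."""
    fields = line.split()
    if len(fields) < 2:
        return None
    raw = fields[1].lstrip('"')  # drop the opening quote of the timestamp
    try:
        return datetime.strptime(raw, STAMP_FORMAT)
    except ValueError:
        return None

# The two sample lines quoted in this answer: one normal, one invalid.
sample_lines = [
    '8321 "15/Jul/2013:18:25:18 +0530"',
    '8322 "5/Jul/2013::8:25:18  0530"',
]

parsed = [parse_stamp(line) for line in sample_lines]
bad_lines = [line for line, stamp in zip(sample_lines, parsed) if stamp is None]

print(parsed)
print(bad_lines)
```

The valid datetime objects can then be passed to pd.DataFrame, and bad_lines shows which records in the file need attention.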

UPDATE

Added a small example that shows why the dtype is object rather than datetime.

Upvotes: 3
