NOOB

Reputation: 2797

Pandas: datetime conversion from dtype object

I am working on a timeseries dataset which looks like this:

         DateTime         SomeVariable 
0   01/01 01:00:00          0.24244
1   01/01 02:00:00          0.84141
2   01/01 03:00:00          0.14144
3   01/01 04:00:00          0.74443
4   01/01 05:00:00          0.99999

The dates have no year. Initially, the dtype of the DateTime column is object, and I am trying to convert it to the pandas datetime format. Since the dates in my data have no year, when I run:

df['DateTime'] = pd.to_datetime(df.DateTime)

I am getting the error OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1-01-01 01:00:00

I understand why I am getting the error (the strings are not in a format pandas accepts), but what I want to know is how I can change the dtype from object to pandas datetime without having a year in my dates. I would appreciate any hints.

EDIT 1:

I have since learned that this can't be done without a year in the data, so this is how I am now trying to change the dtype:

df = pd.read_csv(file_path)  # file_path: placeholder for the CSV file location
df['DateTime'] = pd.to_datetime('2018/'+df['DateTime'], format='%y%d/%m %H:%M:%S')
df.head()

On doing that, I am getting:

ValueError: time data '2018/ 01/01 01:00:00' doesn't match format specified.

EDIT 2:

I changed the format to '%Y/%m/%d %H:%M:%S'. My data is hourly, so the hours run up to 24; the sample above only shows the first 5 hours. Prepending the year left a space between the year and the rest of the DateTime string. To remove that space, this is what I tried:

df['DateTime'] = pd.to_datetime('2018/'+df['DateTime'][1:], format='%Y/%m/%d %H:%M:%S')

I am getting the following error for that:

ValueError: time data '2018/ 01/01  02:00:00' doesn't match format specified

On changing the format to '%y/%m/%d %H:%M:%S' with the same code, this is the error I get:

ValueError: time data '2018/ 01/01  02:00:00' does not match format '%y/%m/%d %H:%M:%S' (match)

The problem is caused by the space after the year, but I am not able to get rid of it.
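Note on the attempt above: indexing a Series with [1:] slices rows, not characters, which would explain why the space survives and why the error now reports the 02:00:00 row instead of the first one. A minimal sketch of the difference, using made-up sample strings with the same leading space (the variable names are only for illustration):

```python
import pandas as pd

# Hypothetical sample values with a leading space, as in the error messages.
s = pd.Series([' 01/01  01:00:00', ' 01/01  02:00:00'])

# Plain [1:] on a Series is a row slice: it drops the first row.
rows_dropped = s[1:]

# .str[1:] slices each string: it drops the first character of every value.
chars_dropped = s.str[1:]

print(rows_dropped.tolist())
print(chars_dropped.tolist())
```

The .str accessor is what applies string operations element by element, so .str[1:] or .str.strip() is the form that actually touches the leading space.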

EDIT 3:

I was able to get rid of the space after the year; however, I am still not able to change the dtype.

df['DateTime'] = pd.to_datetime('2018/'+df['DateTime'].str.strip(), format='%Y/%m/%d %H:%M:%S')

ValueError: time data '2018/01/01  01:00:00' doesn't match format specified

I noticed that there are two spaces between the date and the time in the error message; however, adding two spaces to the format doesn't help.

EDIT 4 (Solution):

I removed all the repeated whitespace, but the format still did not match. The real problem was the time format: the hours in my data run from 1 to 24, while pandas supports 0 to 23. I simply changed the time 24:00:00 to 00:00:00 and it works perfectly now.
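A minimal sketch of that cleanup, for anyone hitting the same issue. The sample strings, the year 2018 and the month/day order are assumptions for illustration. One difference from the plain replacement described above: this version also moves hour-24 rows to the following day, on the assumption that 24:00:00 means midnight at the end of that date.

```python
import pandas as pd

# Hypothetical sample with stray whitespace and an hour-24 stamp.
raw = pd.Series([' 01/01  23:00:00', ' 01/01  24:00:00', ' 01/02  01:00:00'])

# Trim the ends and collapse internal runs of whitespace to one space.
clean = raw.str.strip().str.replace(r'\s+', ' ', regex=True)

# Remember which rows use hour 24, then rewrite them as hour 00.
is_24 = clean.str.contains(' 24:', regex=False)
clean = clean.str.replace(' 24:', ' 00:', regex=False)

# Prepend the assumed year and parse (month/day order is an assumption).
parsed = pd.to_datetime('2018/' + clean, format='%Y/%m/%d %H:%M:%S')

# Shift the former hour-24 rows forward by one day.
parsed = parsed + pd.to_timedelta(is_24.astype(int), unit='D')

print(parsed)
```

If hour 24 in the data is instead meant to stay on the same calendar date, the last step can be dropped.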

Upvotes: 1

Views: 10314

Answers (2)

Victor Valente

Reputation: 781

# Remove spaces. Keep in mind this removes all spaces, including the one between date and time.
df['DateTime'] = df['DateTime'].str.replace(" ", "")

# I'm assuming the year does not matter and that 01/01 is in the format day/month.
df['DateTime'] = pd.to_datetime(df['DateTime'], format='%d/%m%H:%M:%S')
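As a rough check of what this produces: when the format string has no year directive, the parser fills in 1900 as a placeholder year, so the result is a regular datetime column dated 1900. A small sketch with made-up sample values (the variable names are only for illustration):

```python
import pandas as pd

# Hypothetical sample values in the question's layout.
s = pd.Series(['01/01 01:00:00', '01/01 02:00:00'])

# Drop every space, then parse as day/month followed directly by the time.
compact = s.str.replace(' ', '', regex=False)
parsed = pd.to_datetime(compact, format='%d/%m%H:%M:%S')

# Inspect the year the parser filled in.
print(parsed.dt.year.unique())
print(parsed)
```

The dtype is datetime64, but every timestamp carries the placeholder year, which matters for anything that depends on the weekday or on leap years.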

Upvotes: 0

jpp

Reputation: 164623

This is not possible. A datetime object must have a year.

What you can do is ensure all years are aligned for your data.

For example, to convert to datetime while setting year to 2018:

df = pd.DataFrame({'DateTime': ['01/01 01:00:00', '01/01 02:00:00', '01/01 03:00:00',
                                '01/01 04:00:00', '01/01 05:00:00']})

df['DateTime'] = pd.to_datetime('2018/'+df['DateTime'], format='%Y/%m/%d %H:%M:%S')

print(df)

             DateTime
0 2018-01-01 01:00:00
1 2018-01-01 02:00:00
2 2018-01-01 03:00:00
3 2018-01-01 04:00:00
4 2018-01-01 05:00:00

Upvotes: 1
