Jonathon

Reputation: 271

Getting phantom 'b' when converting numpy object array to datetime

I have a large data set with many columns, and I am turning each column into an array. The first column is a time in %H:%M:%S format:

00:00:00
00:00:01
00:00:02
...
23:59:58
23:59:59

When I put this into an array, it makes an object array. I use this to convert it to datetime:

time1=np.array2string(time)                  
dt.datetime.strptime(time1, "%H:%M:%S")

However, I keep getting an error:

ValueError: time data "[b'00:00:00' b'00:01:00' b'00:02:00' ... b'23:57:00' b'23:58:00'\n b'23:59:00']" does not match format '%H:%M:%S'

When I look at the actual array, it indeed does have that phantom 'b', but there is no 'b' in my dataset. It generates it out of thin air. What is causing this?

UPDATE:

I tried

time1=np.array2string(time)                  
time_strings = [dt.datetime.strptime(t, "%H:%M:%S") for t in time1]

and received the error:

ValueError: time data '[' does not match format '%H:%M:%S'

Not sure why a bracket is in there. It still appears to be making a 'b'.

Upvotes: 0

Views: 71

Answers (1)

FObersteiner

Reputation: 25544

Your input seems to be an array of bytes objects; the b prefix is how Python displays bytes, so it is not part of your data. You'll need to decode the bytes to str before you can parse them with strptime. Example:

from datetime import datetime
import numpy as np

time = np.array([b'00:00:00', b'00:00:01', b'00:00:02'])

dt_list = [datetime.strptime(t.decode(encoding='utf-8'), "%H:%M:%S") for t in time]

# dt_list 
# [datetime.datetime(1900, 1, 1, 0, 0),
#  datetime.datetime(1900, 1, 1, 0, 0, 1),
#  datetime.datetime(1900, 1, 1, 0, 0, 2)]

Note: 'utf-8' is the default encoding for decode; adjust it if your data uses a different encoding.
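
As for the bracket in the update: np.array2string returns a single string containing the printed form of the whole array, so looping over it yields one character at a time, starting with '['. The sketch below shows that, along with one possible way to decode all elements at once using np.char.decode. It assumes the array has a fixed-width bytes dtype like the sample above; the names as_text, first_char and decoded are only illustrative. For a true object array, the per-element decode in the list comprehension above is the safer route.

```python
from datetime import datetime
import numpy as np

# Sample data standing in for the time column (assumed to be a bytes array).
time = np.array([b'00:00:00', b'00:00:01', b'00:00:02'])

# array2string gives ONE str with the printed form of the array,
# so iterating over it walks through single characters.
as_text = np.array2string(time)
first_char = as_text[0]  # the opening bracket of the printed array

# Decode every element in one call, then parse each resulting str.
decoded = np.char.decode(time, encoding='utf-8')
dt_list = [datetime.strptime(t, "%H:%M:%S") for t in decoded.tolist()]
```

If only the time of day matters, calling .time() on each parsed value drops the placeholder date of 1900-01-01 that strptime fills in.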

Upvotes: 1
