Niam45

Reputation: 578

Time Data does not match format "'%H:%M.%S%f'"

I am trying to forecast time series data.
The time series data in my CSV file is in the form 0:00.000, so I parsed and indexed the time column as follows:

df.columns=['Elapsed','I']
df['Elapsed']=pd.to_datetime(df['Elapsed'], format='%H:%M.%S%f')
df['Elapsed']=df['Elapsed'].dt.time
df.set_index('Elapsed', inplace=True)

Later, I split my data into a training set and a test set:

train = df.loc['0:00.000':'0:28.778']
test = df.loc['0:28.779':] 

My stack trace is: [image]

An extract of my data is: [image]

Can anyone explain how to prevent this error from occurring?

Upvotes: 1

Views: 2078

Answers (2)

lxop

Reputation: 8595

Since the question has now changed, I'll write a new answer.

Your dataframe is indexed by instances of datetime.time, but you're trying to slice it with strings. pandas cannot compare strings with datetime.time objects, so the slice bounds need to be datetime.time objects too.

To get your slicing to work, try this:

import datetime

# Parse the slice bounds with the same format used for the index
split_from = datetime.datetime.strptime('0:00.000', '%H:%M.%S%f').time()
split_to = datetime.datetime.strptime('0:28.778', '%H:%M.%S%f').time()
train = df[split_from:split_to]

It would also be useful to hold the format in a variable since you're now using it in several places.
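As a minimal sketch of that suggestion, the format can live in a single constant with a small helper around it. The names TIME_FORMAT and parse_elapsed are illustrative and not from the question. The sketch also shows how this particular format reads '0:28.778': %S does not accept 77 as a seconds value, so it consumes a single 7 and %f takes the remaining 78 as the fractional part.

```python
import datetime

# Illustrative constant: the format string from the question, kept in one place.
TIME_FORMAT = '%H:%M.%S%f'

def parse_elapsed(text):
    """Parse a string such as '0:28.778' into a datetime.time using TIME_FORMAT."""
    return datetime.datetime.strptime(text, TIME_FORMAT).time()

split_from = parse_elapsed('0:00.000')
split_to = parse_elapsed('0:28.778')

print(split_from)  # 00:00:00
print(split_to)    # 00:28:07.780000
```

If the column is really minutes, seconds and milliseconds, a different format string may be what you want, but that depends on what the data means.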

Or, if you have fixed split times, you could construct them directly:

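split_from = datetime.time(0, 0, 0)
# datetime.time only accepts integers: 7 seconds and 780000 microseconds
# is what strptime with the format above produces for '0:28.778'
split_to = datetime.time(0, 28, 7, 780000)
train = df[split_from:split_to]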
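To check the idea end to end, here is a small self-contained sketch, assuming pandas is installed. The five sample rows are invented for illustration and are not the asker's data, and to_time is a hypothetical helper name. The point is only that the slice bounds are parsed the same way as the index, so both sides are datetime.time objects.

```python
import pandas as pd

FMT = '%H:%M.%S%f'  # format string from the question

# Invented sample rows, only to exercise the slicing.
df = pd.DataFrame({
    'Elapsed': ['0:00.000', '0:10.250', '0:28.778', '0:28.779', '0:45.100'],
    'I': [1.0, 1.5, 2.0, 2.5, 3.0],
})
df['Elapsed'] = pd.to_datetime(df['Elapsed'], format=FMT).dt.time
df = df.set_index('Elapsed')

def to_time(text):
    """Parse a boundary string the same way the index was parsed."""
    return pd.to_datetime(text, format=FMT).time()

# Label-based slices on .loc include both end points.
train = df.loc[to_time('0:00.000'):to_time('0:28.778')]
test = df.loc[to_time('0:28.779'):]

print(len(train), len(test))
```

The bounds used here are labels that exist in the index, which keeps the lookup simple.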

Upvotes: 1

lxop

Reputation: 8595

Without seeing your data, I'm just guessing, but here goes:

I'm guessing your original data in the 'Elapsed' column looks like

'12:34.5678'
'12:35.1234'

In particular, it has a quote character on each side of the numbers. Otherwise your line

df['Elapsed']=pd.to_datetime(df['Elapsed'], format="'%H:%M.%S%f'")

would fail.

So the error message is telling you that your slicing times have the wrong format: they are missing quotes on each side. Change it to

train = df.loc["'0:00.000'":"'0:28.778'"]

(likewise for the next line) and hopefully that will sort it out.

If you can extract your source data in a way that avoids having quote characters in the timestamps, you'll probably find things a little simpler.
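If the quotes cannot be avoided at the source, one option is to strip them after loading. This is a rough sketch, assuming pandas is installed and assuming the values really do carry literal single quotes as guessed above; the two sample values are invented.

```python
import pandas as pd

# Invented sample values that carry literal quote characters.
elapsed = pd.Series(["'0:00.000'", "'0:28.778'"])

# Drop the surrounding quotes first, then parse with a quote-free format.
cleaned = elapsed.str.strip("'")
times = pd.to_datetime(cleaned, format='%H:%M.%S%f').dt.time

print(cleaned.tolist())
```

With the quotes gone, the format string and any later slice bounds no longer need the extra quote characters.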

Upvotes: 0
