Martin Müller

Reputation: 41

Parsing to Datetime causing ValueError

I have flight data stored in a CSV file, including, for example, the scheduled departure in the form 0005 (for 00:05 am). To work with the data, I need to parse it into a datetime using the format "%H%M". Can you explain why it isn't working? Thanks for your help!

df['SCHEDULED_DEPARTURE'] = pd.to_datetime(df['SCHEDULED_DEPARTURE'], format="%H%M")

ValueError: time data '5' does not match format '%H%M' (match)

Upvotes: 0

Views: 69

Answers (1)

Karol Żak

Reputation: 2406

The problem is in how you are reading the CSV into a DataFrame with pandas. I guess your SCHEDULED_DEPARTURE column gets automatically converted to an integer type, so 0005 becomes just 5, which no longer matches "%H%M".

# reading the CSV "as is", with automatic type conversion
pd.read_csv('test.csv', header=None)


# reading the CSV while forcing a data type for specific columns
pd.read_csv('test.csv', header=None, dtype={2:str})
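To see the difference without a file on disk, here is a minimal sketch that reads the same text twice. The CSV content and the column position (3) are made up for illustration and are not taken from your data.

```python
import io

import pandas as pd

# Hypothetical sample rows: year, month, day, scheduled departure
csv_text = "2015,1,1,0005\n2015,1,1,1430\n"

# Default type inference: the last column is read as integers,
# so the leading zeros are dropped
inferred = pd.read_csv(io.StringIO(csv_text), header=None)

# Forcing column 3 to str keeps the text exactly as written in the file
as_text = pd.read_csv(io.StringIO(csv_text), header=None, dtype={3: str})

print(inferred[3].tolist())  # [5, 1430]
print(as_text[3].tolist())   # ['0005', '1430']
```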


So in your case, your read_csv call should look something like this:

df = pd.read_csv(
  './Originaldata/flights.csv',
  sep=',',
  usecols=['YEAR', 'MONTH', 'DAY', 'SCHEDULED_DEPARTURE'],
  dtype={'SCHEDULED_DEPARTURE':str}
)
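If the column has already been loaded as integers and re-reading the file is inconvenient, an alternative is to pad the values back to four digits before parsing. This is only a sketch with made-up sample values, and it assumes every entry is a valid HHMM time.

```python
import pandas as pd

# Hypothetical values, as they would look after integer conversion
df = pd.DataFrame({"SCHEDULED_DEPARTURE": [5, 1430, 2359]})

# Restore the leading zeros: 5 -> "0005"
padded = df["SCHEDULED_DEPARTURE"].astype(str).str.zfill(4)

# With four digits in every row, "%H%M" now matches
parsed = pd.to_datetime(padded, format="%H%M")

print(padded.tolist())          # ['0005', '1430', '2359']
print(parsed.dt.hour.tolist())  # [0, 14, 23]
```

Because the format contains no date, the parsed values carry a default date of 1900-01-01; use `parsed.dt.time` if you only need the time of day.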

Upvotes: 1
