Reading in hourly data with pd.read_csv()

Question

I'm reading in a csv file with pd.read_csv()

The data contains a financial OHLC time series, it's hourly data. The date and the hour are values seperated with a , .

Now I have two series, one for date and one for hour. Is there a way to use the two values in one series so I can convert the whole thing to a datetime object.

The read method should skip the first comma and use the two first (date and hour) values to store in a series that I will use as index.

df = pd.read_csv (csvpath, names=['Date', 'Hour', 'Open_Price','High_Price','Low_Price','Close_Price'], usecols=[0,1,2,3,4,5]) df['Date'] = pd.to_datetime(df.Date)

I tried converting the hour data to a datetime object but now it's assigning dates to it that don't match the actual data.

Thank you,

Ben Siver · Accepted Answer

This is likely an issue with the format of the dates being parsed from your CSV.

to_datetime accepts a format parameter, which is a string you should create based on the format of the dates you are reading from the CSV. Note that when you leave this parameter off, to_datetime will make a best effort to guess the format of the dates you're parsing. Here's a reference for how you can create the format string: http://strftime.org/

As a simple example: if your dates look like 02-12-19, you could use: pd.to_datetime(df.Date, format='%m-%d-%y')

Reading in hourly data with pd.read_csv()

Answers (2)

Related Questions