Pandas dataframe - creating datetime index from separate columns

Question

I read a csv with separate date and time columns. I generated an index from them in the following way:

data = p.read_csv(fileName,usecols=["date","time","price"])
data.set_index(["date","time"],inplace=True)

This however isn't very useful when I want to get the difference in days or hours between rows. How do I generate a single datetime index from the separate date and time columns?

jezrael · Accepted Answer

I think you need parameter parse_dates with nested list with both columns and parameter index_col with new column created by concanecated columns names separated by _:

data = p.read_csv(fileName,
                  usecols=["date","time","price"], 
                  parse_dates=[["date","time"]], 
                  index_col=['date_time'])

Sample:

from pandas.compat import StringIO

temp=u"""date,time,price
2015-01-01,14:00:10,7
2014-01-01,10:20:10,1"""
#after testing replace 'StringIO(temp)' to 'filename.csv'
df = pd.read_csv(StringIO(temp), 
                 usecols=["date","time","price"], 
                 parse_dates=[["date","time"]],
                 index_col=['date_time'])

print (df)
                     price
date_time                 
2015-01-01 14:00:10      7
2014-01-01 10:20:10      1

print (df.index)
DatetimeIndex(['2015-01-01 14:00:10', '2014-01-01 10:20:10'], 
               dtype='datetime64[ns]', 
               name='date_time', freq=None)

Pandas dataframe - creating datetime index from separate columns

Answers (1)

Related Questions