Scott

Reputation: 1194

Using datetime values as a pandas index and then obtaining that date value for a row

I'm learning to use pandas, and I'm parsing NOAA daily observations (truncated here for clarity):

import pandas as pd
import StringIO

csv_data = """
date,maxt,mint,avgt,pcpn,snow,snwd,hdd,cdd
1872-01-01,48,28,38.0,0.00,M,M,27,0
1872-01-02,43,28,35.5,0.00,M,M,29,0
1872-01-03,47,25,36.0,0.00,M,M,29,0
1872-01-04,39,22,30.5,0.00,M,M,34,0
1872-01-05,37,15,26.0,0.03,M,M,39,0
"""
fake_csv_file = StringIO.StringIO(csv_data)

df = pd.read_csv(fake_csv_file, parse_dates=['date'], index_col='date')

When I check df.index, it appears that my index is made up of datetime values:

>>> df.index
DatetimeIndex(['1872-01-01', '1872-01-02', '1872-01-03', '1872-01-04',
               '1872-01-05'],
              dtype='datetime64[ns]', name=u'date', freq=None)

Now that my date value is the index instead of a column, I can't figure out how to access the date value. I can select a row:

>>> first_row = df.loc['1872-01-01']
>>> print first_row
maxt    48
mint    28
avgt    38
pcpn     0
snow     M
snwd     M
hdd     27
cdd      0
Name: 1872-01-01 00:00:00, dtype: object

Now I'd like to programmatically get that date value, but first_row.index returns something I didn't expect:

>>> first_row.index
Index([u'maxt', u'mint', u'avgt', u'pcpn', u'snow', u'snwd', u'hdd', u'cdd'], dtype='object')

I expected that first_row.index would return the datetime value, but instead it returns this list of all the columns.

Did I do something wrong? What am I missing?

In case my question isn't clear, I'd like to be able to get the date value for a row the same way I can get the value of any of the columns:

>>> first_row.maxt
48
>>> first_row.mint
28

Obviously, this raises an error:

>>> first_row.date # <- something like this?

Also, in case anyone asks, I might want to get to the date value so I can use some of the dt goodies like dayofyear or dayofweek.

Upvotes: 2

Views: 55

Answers (1)

jezrael

Reputation: 862801

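I think you need the name attribute of the Series. When you select a single row, the row's index label (here the date) becomes the name of the resulting Series, and it is a scalar Timestamp: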

first_row = df.loc['1872-01-01']
print (first_row.name)
1872-01-01 00:00:00

Then use:

print (first_row.name.dayofyear)
1
print (first_row.name.dayofweek)
0
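
If you want the date attributes for every row rather than a single one, a minimal sketch along the following lines may help. It assumes Python 3 (io.StringIO instead of the StringIO module) and a shortened copy of the sample data; the column names dayofyear and dayofweek and the variable flat are only illustrative. A DatetimeIndex exposes dayofyear and dayofweek directly, and the .dt accessor becomes available once the index is moved back to a regular column with reset_index():

```python
import io

import pandas as pd

# Shortened copy of the sample data from the question (assumption: Python 3).
csv_data = """date,maxt,mint
1872-01-01,48,28
1872-01-02,43,28
1872-01-03,47,25
"""

df = pd.read_csv(io.StringIO(csv_data), parse_dates=["date"], index_col="date")

# Selecting one row by its exact label returns a Series whose .name is that label.
first_row = df.loc[pd.Timestamp("1872-01-01")]
row_date = first_row.name  # a pandas Timestamp

# The DatetimeIndex itself carries the date attributes (no .dt needed here).
df["dayofyear"] = df.index.dayofyear
df["dayofweek"] = df.index.dayofweek

# Alternatively, turn the index back into an ordinary column and use .dt on it.
flat = df.reset_index()
weekday_from_column = flat["date"].dt.dayofweek
```

Using df.index keeps the dates as the index, which is usually what you want for time series selection; reset_index() is mainly useful when you prefer to treat the date like any other column.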

Upvotes: 1
