Nick Vence

Reputation: 770

Pandas: Extracting values from a DatetimeIndex

I have a Pandas DataFrame whose row index and columns are both a DatetimeIndex.

import pandas as pd

data = pd.DataFrame(
    {
        "PERIOD_END_DATE": pd.date_range(start="2018-01", end="2018-04", freq="M"),
        "first": list("abc"),
        "second": list("efg")
    }
).set_index("PERIOD_END_DATE")

data.columns = pd.date_range(start="2018-01", end="2018-03", freq="M")
data

Unfortunately, I am getting a variety of errors when I try to pull out a value:

data['2018-01', '2018-02']       # InvalidIndexError: ('2018-01', '2018-02')
data['2018-01', ['2018-02']]     # InvalidIndexError: ('2018-01', ['2018-02'])
data.loc['2018-01', '2018-02']   # TypeError: only integer scalar arrays can be converted to a scalar index
data.loc['2018-01', ['2018-02']] # KeyError: "None of [Index(['2018-02'], dtype='object')] are in the [columns]" 

How do I extract a value from a DataFrame that uses a DatetimeIndex?

Upvotes: 1

Views: 2276

Answers (4)

Nick Vence

Reputation: 770

Timestamp indexes are finicky. Pandas accepts each of the following expressions, but they return different types.

    data.loc['2018-01',['2018-01-31']]
    data.loc['2018-01-31',['2018-01-31']]
    data.loc['2018-01','2018-01-31']
    data.loc['2018-01-31','2018-01']
    data.loc['2018-01-31','2018-01-31']

Upvotes: 0

Kevin

Reputation: 36

There are two issues:

  1. Since you are using a DataFrame with a DatetimeIndex, the correct notation for selecting by row and column is either:

a) data.loc[rows_index_name, [column_index_name]]

or

b) data.loc[rows_index_name, column_index_name]

depending on the type of output you want.

Notation (a) returns a Series, while notation (b) returns a single value (here, a string).

  2. The index names cannot be truncated: you must specify the whole date string.

As such, your issue will be resolved with:

data.loc['2018-01-31',['2018-01-31']] or data.loc['2018-01-31','2018-01-31']
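
A minimal sketch of the two notations, assuming the frame from the question rebuilt with explicit month-end dates. The names row_label and col_label are illustrative, and pd.Timestamp objects stand in for the date strings so that no string parsing is involved:

```python
import pandas as pd

# Rebuild the question's frame with explicit month-end dates (assumed layout).
data = pd.DataFrame(
    [["a", "e"], ["b", "f"], ["c", "g"]],
    index=pd.to_datetime(["2018-01-31", "2018-02-28", "2018-03-31"]),
    columns=pd.to_datetime(["2018-01-31", "2018-02-28"]),
)

# Illustrative labels: full dates, matching the index entries exactly.
row_label = pd.Timestamp("2018-01-31")
col_label = pd.Timestamp("2018-02-28")

# Notation (b): scalar row label, scalar column label -> a single value.
scalar = data.loc[row_label, col_label]

# Notation (a): scalar row label, list of column labels -> a Series.
series = data.loc[row_label, [col_label]]

print(type(scalar), type(series))
```

When only one cell is needed, data.at[row_label, col_label] should give the same value as notation (b).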

Upvotes: 2

Naveed

Reputation: 11650

You are indexing with a period (YYYY-MM) on columns that hold dates. Converting the columns to a PeriodIndex helps in this case:


data.columns = pd.period_range(start="2018-01", end="2018-02", freq='M')
data[['2018-01']] 


                  2018-01
PERIOD_END_DATE     
     2018-01-31     a
     2018-02-28     b
     2018-03-31     c
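
As a variation, and only a sketch under assumptions: instead of building a fresh period range, the existing date columns could be converted with DatetimeIndex.to_period, and the column looked up with an explicit pd.Period key. The frame below is rebuilt from the question with explicit month-end dates, and the variable name january is illustrative:

```python
import pandas as pd

# Rebuild the question's frame with explicit month-end dates (assumed layout).
data = pd.DataFrame(
    [["a", "e"], ["b", "f"], ["c", "g"]],
    index=pd.to_datetime(["2018-01-31", "2018-02-28", "2018-03-31"]),
    columns=pd.to_datetime(["2018-01-31", "2018-02-28"]),
)

# Convert the existing date columns to monthly periods in place of
# assigning a separately constructed period_range.
data.columns = data.columns.to_period("M")

# Look the column up with an explicit Period key (returns a Series).
january = data[pd.Period("2018-01", freq="M")]
print(january)
```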

Upvotes: 0

SHENOOOO

Reputation: 56

Once you have set the date as the index, you will not be able to slice or extract any part of it. You can extract the month and day from it while it is a regular column, but not when it is the index. I ran into this before, and this was the solution.

I kept it as a regular column, extracted the month, day, and year into a separate column each, and then assigned the date column as the index.
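
A minimal sketch of that workflow, assuming a frame shaped like the one in the question. The column names year, month, and day, and the variable feb, are illustrative:

```python
import pandas as pd

# Keep the date as a regular column at first (assumed data from the question).
df = pd.DataFrame(
    {
        "PERIOD_END_DATE": pd.to_datetime(
            ["2018-01-31", "2018-02-28", "2018-03-31"]
        ),
        "first": list("abc"),
    }
)

# Extract the date parts into separate columns with the .dt accessor.
df["year"] = df["PERIOD_END_DATE"].dt.year
df["month"] = df["PERIOD_END_DATE"].dt.month
df["day"] = df["PERIOD_END_DATE"].dt.day

# Only now make the date column the index.
df = df.set_index("PERIOD_END_DATE")

# Filter on the extracted parts rather than on the index itself.
feb = df[df["month"] == 2]
print(feb)
```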

Upvotes: 0
