Ben DeMott

Reputation: 3604

Pandas referencing index column by name

I'm using pandas 0.23.4 (pandas-0.23.4-cp36-cp36m-manylinux1_x86_64.whl), and I noticed that once you set a column as the index, you can no longer reference it by name. Is there any way to reference the column after it has been set as the index? The code below raises a KeyError.

import pandas as pd
from datetime import datetime, timedelta
df = pd.DataFrame()

one_month = datetime.now() - timedelta(days=30)
ts_index = pd.date_range(one_month, periods=30, freq='1D')

df.insert(0, 'tscol', ts_index)
df.insert(1, 'value', 1.0)

print(df.head())

# set the timeseries column as the index.
df.set_index('tscol', inplace=True)

print(df.head())

for index, row in df.iterrows():
    print(row['tscol'])
    break

Here is the DataFrame before and after tscol becomes the index:

Before

                       tscol  value
0 2018-08-19 10:53:32.412154    1.0
1 2018-08-20 10:53:32.412154    1.0
2 2018-08-21 10:53:32.412154    1.0
3 2018-08-22 10:53:32.412154    1.0
4 2018-08-23 10:53:32.412154    1.0

After

                            value
tscol                            
2018-08-19 10:53:32.412154    1.0
2018-08-20 10:53:32.412154    1.0
2018-08-21 10:53:32.412154    1.0
2018-08-22 10:53:32.412154    1.0
2018-08-23 10:53:32.412154    1.0

Accessing row['tscol'] inside the loop then raises this exception:

Traceback (most recent call last):
  File "/home/ben/.local/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3124, in get_value
    return libindex.get_value_box(s, key)
  File "pandas/_libs/index.pyx", line 55, in pandas._libs.index.get_value_box
  File "pandas/_libs/index.pyx", line 63, in pandas._libs.index.get_value_box
TypeError: 'str' object cannot be interpreted as an integer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "index_by_name.py", line 24, in <module>
    print(row['tscol'])
  File "/home/ben/.local/lib/python3.6/site-packages/pandas/core/series.py", line 767, in __getitem__
    result = self.index.get_value(self, key)
  File "/home/ben/.local/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3132, in get_value
    raise e1
  File "/home/ben/.local/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3118, in get_value
    tz=getattr(series.dtype, 'tz', None))
  File "pandas/_libs/index.pyx", line 106, in pandas._libs.index.IndexEngine.get_value
  File "pandas/_libs/index.pyx", line 114, in pandas._libs.index.IndexEngine.get_value
  File "pandas/_libs/index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1492, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'tscol'

Upvotes: 5

Views: 6885

Answers (1)

Chris Adams

Reputation: 18647

You can pass the argument drop=False to set_index so that tscol stays in the DataFrame as a regular column in addition to becoming the index:

df.set_index('tscol', inplace=True, drop=False)
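If duplicating the column is not wanted, the index values can still be reached without drop=False. The sketch below is not part of the original answer; it uses a fixed start date and five rows in place of the question's data and shows three standard pandas options: the label that iterrows() yields (also available as row.name), the df.index attribute, and reset_index() to turn the index back into a column.

```python
import pandas as pd

# Small stand-in for the question's frame (fixed dates, assumed for illustration).
ts_index = pd.date_range("2018-08-19", periods=5, freq="D")
df = pd.DataFrame({"tscol": ts_index, "value": 1.0})
df = df.set_index("tscol")  # tscol is now the index, not a column

# Option 1: iterrows() yields (index label, row); the label is also row.name.
for index, row in df.iterrows():
    first_from_iterrows = index
    first_from_row_name = row.name
    break

# Option 2: read the index directly; its name is the old column name.
index_name = df.index.name
first_from_index = df.index[0]

# Option 3: move the index back into an ordinary column.
restored = df.reset_index()
first_from_column = restored["tscol"].iloc[0]
```

Options 1 and 2 leave the frame unchanged; reset_index() returns a new DataFrame with a default integer index.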

Upvotes: 3
