Reputation:
Is it possible to check whether a pandas DataFrame is indexed, that is, whether DataFrame.set_index(...) was ever called on it? I could check whether df.index is a numeric list, but that's not a perfect test.
Upvotes: 1
Views: 3117
Reputation: 183
The following worked for me: I call set_index([label], append=False) if the dataframe has the default RangeIndex, and set_index([label], append=True) otherwise.
append = not isinstance(df.index, pd.RangeIndex)
df.set_index([label], drop=True, append=append, inplace=True)
My assumption is that when the index is the default RangeIndex, I can discard it when setting another column as the index.
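Wrapped up as a function, the same idea might look like the sketch below. The helper name add_index_level is made up for illustration, and it returns a new dataframe instead of using inplace=True. It inherits the assumption above: any RangeIndex is treated as disposable.

```python
import pandas as pd


def add_index_level(df, label):
    """Return a copy of df with `label` moved into the index.

    Hypothetical helper: a RangeIndex is assumed to be the default
    index and is replaced; any other index is kept and extended.
    """
    append = not isinstance(df.index, pd.RangeIndex)
    return df.set_index([label], drop=True, append=append)


df = pd.DataFrame({"id1": [1, 2, 3], "id2": [4, 5, 6], "word": ["cat", "mouse", "game"]})
once = add_index_level(df, "id1")     # RangeIndex is replaced by id1
twice = add_index_level(once, "id2")  # id2 is appended as a second level
print(list(once.index.names), list(twice.index.names))
```

Calling it twice therefore builds up a MultiIndex rather than overwriting the first index column.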
Upvotes: 0
Reputation: 21
I just ran into this myself. The problem is that a dataframe already has an index before .set_index() is called, so the question is really whether or not the index is named. For that check, df.index.name appears to be less reliable than df.index.names, because df.index.name is None for a MultiIndex even when its levels are named:
>>> import pandas as pd
>>> df = pd.DataFrame({"id1": [1, 2, 3], "id2": [4,5,6], "word": ["cat", "mouse", "game"]})
>>> df
   id1  id2   word
0    1    4    cat
1    2    5  mouse
2    3    6   game
>>> df.index
RangeIndex(start=0, stop=3, step=1)
>>> df.index.name, df.index.names[0]
(None, None)
>>> "indexed" if df.index.names[0] else "no index"
'no index'
>>> df1 = df.set_index("id1")
>>> df1
     id2   word
id1
1      4    cat
2      5  mouse
3      6   game
>>> df1.index
Int64Index([1, 2, 3], dtype='int64', name='id1')
>>> df1.index.name, df1.index.names[0]
('id1', 'id1')
>>> "indexed" if df1.index.names[0] else "no index"
'indexed'
>>> df12 = df.set_index(["id1", "id2"])
>>> df12
          word
id1 id2
1   4      cat
2   5    mouse
3   6     game
>>> df12.index
MultiIndex([(1, 4),
            (2, 5),
            (3, 6)],
           names=['id1', 'id2'])
>>> df12.index.name, df12.index.names[0]
(None, 'id1')
>>> "indexed" if df12.index.names[0] else "no index"
'indexed'
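If you want this as a reusable check, something like the following sketch could work. The name has_named_index is hypothetical. It looks at every level in df.index.names and compares against None explicitly, so a falsy name such as an empty string still counts as a name, which the truthiness test above would miss.

```python
import pandas as pd


def has_named_index(df):
    """True if any index level carries a name (hypothetical helper)."""
    return any(name is not None for name in df.index.names)


df = pd.DataFrame({"id1": [1, 2, 3], "id2": [4, 5, 6], "word": ["cat", "mouse", "game"]})
print(has_named_index(df))                            # default RangeIndex, unnamed
print(has_named_index(df.set_index("id1")))           # single named index
print(has_named_index(df.set_index(["id1", "id2"])))  # MultiIndex with named levels
```

Note that this only detects a named index. An index assigned without a name, for example df.index = [10, 20, 30], would still be reported as unnamed.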
Upvotes: 2
Reputation: 375715
One way would be to compare it to the plain Index:
pd.Index(np.arange(0, len(df))).equals(df.index)
For example:
In [11]: df = pd.DataFrame([['a', 'b'], ['c', 'd']], columns=['A', 'B'])
In [12]: df
Out[12]:
   A  B
0  a  b
1  c  d
In [13]: pd.Index(np.arange(0, len(df))).equals(df.index)
Out[13]: True
If it's not the plain index, it returns False:
In [14]: df = df.set_index('A')
In [15]: pd.Index(np.arange(0, len(df))).equals(df.index)
Out[15]: False
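A variation on the same comparison, as a rough sketch: the helper name has_default_index is made up, and it compares against pd.RangeIndex(len(df)) instead of building an array with np.arange. The extra check on the index name is my own addition. Without it, an index set from a column that happens to hold 0, 1, 2, ... would compare equal by value and look like the plain index.

```python
import pandas as pd


def has_default_index(df):
    """True if the index is unnamed and equals 0..len(df)-1 (hypothetical helper)."""
    return df.index.name is None and df.index.equals(pd.RangeIndex(len(df)))


df = pd.DataFrame([["a", "b"], ["c", "d"]], columns=["A", "B"])
indexed = df.set_index("A")

# Edge case: the index column holds 0, 1, so only the name gives it away.
numbered = pd.DataFrame({"k": [0, 1], "v": ["x", "y"]}).set_index("k")

print(has_default_index(df), has_default_index(indexed), has_default_index(numbered))
```

This is still a heuristic: it cannot tell a freshly created dataframe apart from one whose index was explicitly reset or assigned to an unnamed 0..n-1 sequence.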
Upvotes: 5