Reputation: 5504
How do I get the index column name in Python's pandas? Here's an example dataframe:
             Column 1
Index Title
Apples              1
Oranges             2
Puppies             3
Ducks               4
What I'm trying to do is get/set the dataframe's index title. Here is what I tried:
import pandas as pd
data = {'Column 1' : [1., 2., 3., 4.],
'Index Title': ["Apples", "Oranges", "Puppies", "Ducks"]}
df = pd.DataFrame(data)
df.index = df["Index Title"]
del df["Index Title"]
Anyone know how to do this?
Upvotes: 436
Views: 1114263
Reputation: 23011
pd.Index to name an index (or column) from construction
Pandas has Index (MultiIndex) objects that accept names. Passing those as index or columns on dataframe construction constructs frames with named indices/columns.
data = {'Column 1': [1,2,3,4], 'Index Title': ["Apples","Oranges","Puppies","Ducks"]}
# for RangeIndex
df = pd.DataFrame(data, index=pd.Index(range(4), name='foo'))
# ^^^^^^^^ <---- here
# for Index
df = pd.DataFrame(data, index=pd.Index(data['Index Title'], name='foo'))
# ^^^^^^^^ <---- here
# for columns
df = pd.DataFrame(data, columns=pd.Index(data.keys(), name='foo'))
# ^^^^^^^^ <---- here
# for MultiIndex
df = pd.DataFrame(data, index=pd.MultiIndex.from_arrays([['Fruit', 'Fruit', 'Animal', 'Animal'], data['Index Title']], names=['foo', 'bar']))
# ^^^^^^^^^^^^^ <---- here
If the dataframe has a MultiIndex and an index name at a specific level has to be changed, index.set_names may be used. For example, to change the name of the second index level, use the following. Don't forget inplace=True.
df.index.set_names('foo', level=1, inplace=True)
# equivalently, rename could be used with a dict
df.index.rename({'Index Title 2': 'foo'}, inplace=True)
set_names can also be used for just a regular index (set level=None). However, rename_axis is probably easier.
df.index.set_names('foo', level=None, inplace=True)
# equivalent to the following
df.index.name = 'foo'
df = df.rename_axis('foo')
There's a corresponding columns.set_names for columns.
df.columns.set_names('foo', level=None, inplace=True)
# equivalent to
df = df.rename_axis(columns='foo')
# for MultiIndex columns
df.columns.set_names('foo', level=0, inplace=True)
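As a minimal sketch of the level-specific rename described above: the two-level index and its level names ('kind', 'item') are made up for illustration, and the result of set_names is assigned back rather than relying on inplace=True.

```python
import pandas as pd

# Hypothetical two-level index; the level names are illustrative only.
idx = pd.MultiIndex.from_arrays(
    [["Fruit", "Fruit", "Animal", "Animal"],
     ["Apples", "Oranges", "Puppies", "Ducks"]],
    names=["kind", "item"],
)
df = pd.DataFrame({"Column 1": [1, 2, 3, 4]}, index=idx)

# Rename only the second level (level=1); the first level keeps its name.
df.index = df.index.set_names("foo", level=1)
level_names = list(df.index.names)
print(level_names)
```

Assigning the returned index back to df.index has the same effect as the inplace form, but keeps the change visible at the call site.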
Upvotes: 3
Reputation: 510
Setting the index name can also be accomplished at creation:
pd.DataFrame(data={'age': [10,20,30], 'height': [100, 170, 175]}, index=pd.Series(['a', 'b', 'c'], name='Tag'))
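A quick way to check this, assuming the same toy data: the name of the Series passed as index should carry over to the resulting index.

```python
import pandas as pd

df = pd.DataFrame(
    data={"age": [10, 20, 30], "height": [100, 170, 175]},
    index=pd.Series(["a", "b", "c"], name="Tag"),
)

# The Series name becomes the name of the dataframe's index.
index_name = df.index.name
print(index_name)
```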
Upvotes: 16
Reputation: 2598
To just get the index column names, df.index.names will work for both a single Index and a MultiIndex as of the most recent version of pandas.
As someone who found this while trying to find the best way to get a list of index names + column names, I would have found this answer useful:
names = list(filter(None, df.index.names + df.columns.values.tolist()))
This works for no index, a single-column Index, or a MultiIndex. It avoids calling reset_index(), which has an unnecessary performance hit for such a simple operation. I'm surprised there isn't a built-in method for this (that I've come across). I guess I run into needing this more often because I'm shuttling data from databases where the dataframe index maps to a primary/unique key, but is really just another column to me.
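One caveat with filter(None, ...) is that it drops every falsy label, so a column named 0 or '' would disappear along with the unnamed index. The sketch below is one possible variant that skips only unnamed (None) index levels. The helper name index_and_column_names is hypothetical, not a pandas function.

```python
import pandas as pd


def index_and_column_names(df):
    """Return named index levels followed by column labels.

    Hypothetical helper: only unnamed (None) index levels are skipped,
    so falsy labels such as 0 or '' are kept.
    """
    index_names = [name for name in df.index.names if name is not None]
    return index_names + df.columns.tolist()


plain = pd.DataFrame({"a": [1, 2], "b": [3, 4]})       # default, unnamed index
keyed = plain.set_index("a")                            # single named index
multi = plain.assign(c=[5, 6]).set_index(["a", "b"])    # two-level MultiIndex
numbered = pd.DataFrame([[1, 2]])                       # columns labelled 0 and 1

for frame in (plain, keyed, multi, numbered):
    print(index_and_column_names(frame))
```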
Upvotes: 2
Reputation: 31898
The solution for multi-indexes is inside jezrael's cyclopedic answer, but it took me a while to find it so I am posting a new answer:
df.index.names gives the names of a multi-index (as a FrozenList).
Upvotes: 6
Reputation: 862406
You can use rename_axis; to remove the name, set it to None:
d = {'Index Title': ['Apples', 'Oranges', 'Puppies', 'Ducks'],'Column 1': [1.0, 2.0, 3.0, 4.0]}
df = pd.DataFrame(d).set_index('Index Title')
print (df)
Column 1
Index Title
Apples 1.0
Oranges 2.0
Puppies 3.0
Ducks 4.0
print (df.index.name)
Index Title
print (df.columns.name)
None
The new functionality works well in method chains.
df = df.rename_axis('foo')
print (df)
Column 1
foo
Apples 1.0
Oranges 2.0
Puppies 3.0
Ducks 4.0
You can also rename the name of the columns axis with the axis parameter:
d = {'Index Title': ['Apples', 'Oranges', 'Puppies', 'Ducks'],'Column 1': [1.0, 2.0, 3.0, 4.0]}
df = pd.DataFrame(d).set_index('Index Title').rename_axis('Col Name', axis=1)
print (df)
Col Name Column 1
Index Title
Apples 1.0
Oranges 2.0
Puppies 3.0
Ducks 4.0
print (df.index.name)
Index Title
print (df.columns.name)
Col Name
print(df.rename_axis('foo').rename_axis("bar", axis="columns"))
bar Column 1
foo
Apples 1.0
Oranges 2.0
Puppies 3.0
Ducks 4.0
print(df.rename_axis('foo').rename_axis("bar", axis=1))
bar Column 1
foo
Apples 1.0
Oranges 2.0
Puppies 3.0
Ducks 4.0
From pandas 0.24.0 onward, it is possible to use the index and columns parameters:
df = df.rename_axis(index='foo', columns="bar")
print (df)
bar Column 1
foo
Apples 1.0
Oranges 2.0
Puppies 3.0
Ducks 4.0
To remove the index and columns names, set them to None:
df = df.rename_axis(index=None, columns=None)
print (df)
Column 1
Apples 1.0
Oranges 2.0
Puppies 3.0
Ducks 4.0
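A short sketch, on a two-row version of the same toy data, of how the keyword forms behave: each rename_axis call returns a new dataframe and leaves the one it was called on unchanged, which is why the result has to be assigned.

```python
import pandas as pd

df = pd.DataFrame(
    {"Index Title": ["Apples", "Oranges"], "Column 1": [1.0, 2.0]}
).set_index("Index Title")

# Set both axis names, then clear them again on a separate copy.
renamed = df.rename_axis(index="foo", columns="bar")
cleared = renamed.rename_axis(index=None, columns=None)

print(df.index.name, renamed.index.name, cleared.index.name)
```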
If only the index is a MultiIndex:
import numpy as np  # needed for np.random.randint below
mux = pd.MultiIndex.from_arrays([['Apples', 'Oranges', 'Puppies', 'Ducks'],
list('abcd')],
names=['index name 1','index name 1'])
df = pd.DataFrame(np.random.randint(10, size=(4,6)),
index=mux,
columns=list('ABCDEF')).rename_axis('col name', axis=1)
print (df)
col name A B C D E F
index name 1 index name 1
Apples a 5 4 0 5 2 2
Oranges b 5 8 2 5 9 9
Puppies c 7 6 0 7 8 3
Ducks d 6 5 0 1 6 0
print (df.index.name)
None
print (df.columns.name)
col name
print (df.index.names)
['index name 1', 'index name 1']
print (df.columns.names)
['col name']
df1 = df.rename_axis(('foo','bar'))
print (df1)
col name A B C D E F
foo bar
Apples a 5 4 0 5 2 2
Oranges b 5 8 2 5 9 9
Puppies c 7 6 0 7 8 3
Ducks d 6 5 0 1 6 0
df2 = df.rename_axis('baz', axis=1)
print (df2)
baz A B C D E F
index name 1 index name 1
Apples a 5 4 0 5 2 2
Oranges b 5 8 2 5 9 9
Puppies c 7 6 0 7 8 3
Ducks d 6 5 0 1 6 0
df2 = df.rename_axis(index=('foo','bar'), columns='baz')
print (df2)
baz A B C D E F
foo bar
Apples a 5 4 0 5 2 2
Oranges b 5 8 2 5 9 9
Puppies c 7 6 0 7 8 3
Ducks d 6 5 0 1 6 0
To remove the index and columns names, set them to None:
df2 = df.rename_axis(index=(None,None), columns=None)
print (df2)
A B C D E F
Apples a 6 9 9 5 4 6
Oranges b 2 6 7 4 3 5
Puppies c 6 3 6 3 5 1
Ducks d 4 9 1 3 0 5
For a MultiIndex in both the index and the columns, it is necessary to work with .names instead of .name and to set the names with a list or tuple:
mux1 = pd.MultiIndex.from_arrays([['Apples', 'Oranges', 'Puppies', 'Ducks'],
list('abcd')],
names=['index name 1','index name 1'])
mux2 = pd.MultiIndex.from_product([list('ABC'),
list('XY')],
names=['col name 1','col name 2'])
df = pd.DataFrame(np.random.randint(10, size=(4,6)), index=mux1, columns=mux2)
print (df)
col name 1 A B C
col name 2 X Y X Y X Y
index name 1 index name 1
Apples a 2 9 4 7 0 3
Oranges b 9 0 6 0 9 4
Puppies c 2 4 6 1 4 4
Ducks d 6 6 7 1 2 8
The plural form is necessary to check or set the values:
print (df.index.name)
None
print (df.columns.name)
None
print (df.index.names)
['index name 1', 'index name 1']
print (df.columns.names)
['col name 1', 'col name 2']
df1 = df.rename_axis(('foo','bar'))
print (df1)
col name 1 A B C
col name 2 X Y X Y X Y
foo bar
Apples a 2 9 4 7 0 3
Oranges b 9 0 6 0 9 4
Puppies c 2 4 6 1 4 4
Ducks d 6 6 7 1 2 8
df2 = df.rename_axis(('baz','bak'), axis=1)
print (df2)
baz A B C
bak X Y X Y X Y
index name 1 index name 1
Apples a 2 9 4 7 0 3
Oranges b 9 0 6 0 9 4
Puppies c 2 4 6 1 4 4
Ducks d 6 6 7 1 2 8
df2 = df.rename_axis(index=('foo','bar'), columns=('baz','bak'))
print (df2)
baz A B C
bak X Y X Y X Y
foo bar
Apples a 2 9 4 7 0 3
Oranges b 9 0 6 0 9 4
Puppies c 2 4 6 1 4 4
Ducks d 6 6 7 1 2 8
To remove the index and columns names, set them to None:
df2 = df.rename_axis(index=(None,None), columns=(None,None))
print (df2)
A B C
X Y X Y X Y
Apples a 2 0 2 5 2 0
Oranges b 1 7 5 5 4 8
Puppies c 2 4 6 3 6 5
Ducks d 9 6 3 9 7 0
And @Jeff's solution:
df.index.names = ['foo','bar']
df.columns.names = ['baz','bak']
print (df)
baz A B C
bak X Y X Y X Y
foo bar
Apples a 3 4 7 3 3 3
Oranges b 1 2 5 8 1 0
Puppies c 9 6 3 9 6 3
Ducks d 3 2 1 0 1 0
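When only one level of a MultiIndex needs a new name, rename_axis also accepts a dict for index or columns that maps old level names to new ones. A minimal sketch, with level names ('name', 'code') chosen only for illustration:

```python
import pandas as pd

mux = pd.MultiIndex.from_arrays(
    [["Apples", "Oranges", "Puppies", "Ducks"], list("abcd")],
    names=["name", "code"],  # illustrative level names
)
df = pd.DataFrame({"Column 1": [1, 2, 3, 4]}, index=mux)

# A dict maps old level names to new ones; unlisted levels are unchanged.
renamed = df.rename_axis(index={"code": "foo"})
new_names = list(renamed.index.names)
print(new_names)
```

This avoids restating every level name, which the tuple form requires.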
Upvotes: 139
Reputation: 2578
Use df.index.rename('foo', inplace=True) to set the index name.
It seems this API has been available since pandas 0.13.
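For comparison, a small sketch of the non-inplace form on made-up data: with inplace left at its default, rename returns a new Index and the dataframe only changes once that result is assigned back.

```python
import pandas as pd

df = pd.DataFrame({"Column 1": [1, 2]}, index=["Apples", "Oranges"])

# Without inplace=True, the dataframe's own index is left untouched.
new_index = df.index.rename("foo")
name_before_assignment = df.index.name

df.index = new_index
name_after_assignment = df.index.name
print(name_before_assignment, name_after_assignment)
```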
Upvotes: 22
Reputation: 4914
If you do not want to create a new row but simply put it in the empty cell then use:
df.columns.name = 'foo'
Otherwise use:
df.index.name = 'foo'
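The two attributes are independent, so both can be set on the same dataframe. A minimal sketch on made-up data, where printing the frame shows where each label ends up:

```python
import pandas as pd

df = pd.DataFrame({"Column 1": [1, 2]}, index=["Apples", "Oranges"])

df.columns.name = "foo"  # label for the column axis, shown in the header row
df.index.name = "bar"    # label for the row axis, shown on its own row

print(df)
```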
Upvotes: 23
Reputation: 13850
df.index.name should do the trick.
Python has a dir function that lets you query object attributes. dir(df.index) was helpful here.
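A sketch of that kind of exploration, on made-up data: filtering the dir output for public attributes that mention "name" narrows the list to the relevant candidates. The exact list depends on the installed pandas version.

```python
import pandas as pd

df = pd.DataFrame({"Column 1": [1, 2]}, index=["Apples", "Oranges"])

# Public attributes of the index whose identifier mentions "name".
name_attrs = [
    attr for attr in dir(df.index)
    if "name" in attr and not attr.startswith("_")
]
print(name_attrs)
```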
Upvotes: 33
Reputation: 128918
You can just get/set the index name via its name property:
In [7]: df.index.name
Out[7]: 'Index Title'
In [8]: df.index.name = 'foo'
In [9]: df.index.name
Out[9]: 'foo'
In [10]: df
Out[10]:
Column 1
foo
Apples 1
Oranges 2
Puppies 3
Ducks 4
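One practical consequence, sketched here with the same toy data: the index name is what reset_index uses as the column label when the index is turned back into a regular column.

```python
import pandas as pd

df = pd.DataFrame(
    {"Column 1": [1, 2, 3, 4]},
    index=["Apples", "Oranges", "Puppies", "Ducks"],
)
df.index.name = "foo"

# reset_index moves the index into a regular column named after it.
flat = df.reset_index()
column_labels = flat.columns.tolist()
print(column_labels)
```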
Upvotes: 623