msakya
msakya

Reputation: 9801

How to convert index of a pandas dataframe into a column

How to convert an index of a dataframe into a column?

For example:

        gi       ptt_loc
 0  384444683      593  
 1  384444684      594 
 2  384444686      596  

to

    index1    gi       ptt_loc
 0  0     384444683      593  
 1  1     384444684      594 
 2  2     384444686      596  

Upvotes: 854

Views: 1355975

Answers (11)

zkurtz
zkurtz

Reputation: 3280

The simplest way to handle this in a general and safe way:

from pandahandler import indexes

df = indexes.unset(df)

The unset function is a no-op in case the index is already a unnamed range index. Otherwise, it converts the index column(s) to regular columns after first checking that the data frame does not already have a column with the same name as one of the index columns.

Disclosure: I wrote pandahandler.

Upvotes: 0

cottontail
cottontail

Reputation: 23011

To retain the index (that was converted into a column) as index, use a combination of to_frame() and join(). In particular, this doesn't produce a SettingWithCopyWarning, unlike assignment.

df = df.index.to_frame(name='A').join(df)

res1

This works for MultiIndex, too.

df = df.index.to_frame(name=['A', 'B']).join(df)

Also, as Quinten mentions, since pandas 1.5.0, rename_axis + reset_index (or reset_index + rename) syntax have become obsolete. You can directly pass names= as an argument to reset_index(). Even duplicate column names are allowed if allow_duplicates=True is passed (although having duplicate column labels is highly unadvisable).

df = df.reset_index(names=['A', 'B'])

res2

Upvotes: 6

behzad.nouri
behzad.nouri

Reputation: 77941

either:

df['index1'] = df.index

or .reset_index:

df = df.reset_index()

If you have a multi-index frame with 3 levels of index, like:

>>> df
                       val
tick       tag obs        
2016-02-26 C   2    0.0139
2016-02-27 A   2    0.5577
2016-02-28 C   6    0.0303

and you want to convert the 1st (tick) and 3rd (obs) levels in the index into columns, you could do:

>>> df.reset_index(level=['tick', 'obs'])
          tick  obs     val
tag                        
C   2016-02-26    2  0.0139
A   2016-02-27    2  0.5577
C   2016-02-28    6  0.0303

Upvotes: 1389

Francesco Pinna
Francesco Pinna

Reputation: 161

I usually do it this way:

df = df.assign(index1=df.index)

Upvotes: 4

Quinten
Quinten

Reputation: 41225

In the newest version of pandas 1.5.0, you could use the function reset_index with the new argument names to specify a list of names you want to give the index columns. Here is a reproducible example with one index column:

import pandas as pd

df = pd.DataFrame({"gi":[232,66,34,43],"ptt":[342,56,662,123]})

    gi  ptt
0  232  342
1   66   56
2   34  662
3   43  123

df.reset_index(names=['new'])

Output:

   new   gi  ptt
0    0  232  342
1    1   66   56
2    2   34  662
3    3   43  123

This can also easily be applied with MultiIndex. Just create a list of the names you want.

Upvotes: 2

rohetoric
rohetoric

Reputation: 354

This should do the trick (if not multilevel indexing) -

df.reset_index().rename({'index':'index1'}, axis = 'columns')

Code Result

And of course, you can always set inplace = True, if you do not want to assign this to a new variable in the function parameter of rename.

Upvotes: 13

Avneesh Hota
Avneesh Hota

Reputation: 51

df1 = pd.DataFrame({"gi":[232,66,34,43],"ptt":[342,56,662,123]})
p = df1.index.values
df1.insert( 0, column="new",value = p)
df1

    new     gi     ptt
0    0      232    342
1    1      66     56 
2    2      34     662
3    3      43     123

Upvotes: 5

jpp
jpp

Reputation: 164613

rename_axis + reset_index

You can first rename your index to a desired label, then elevate to a series:

df = df.rename_axis('index1').reset_index()

print(df)

   index1         gi  ptt_loc
0       0  384444683      593
1       1  384444684      594
2       2  384444686      596

This works also for MultiIndex dataframes:

print(df)
#                        val
# tick       tag obs        
# 2016-02-26 C   2    0.0139
# 2016-02-27 A   2    0.5577
# 2016-02-28 C   6    0.0303

df = df.rename_axis(['index1', 'index2', 'index3']).reset_index()

print(df)

       index1 index2  index3     val
0  2016-02-26      C       2  0.0139
1  2016-02-27      A       2  0.5577
2  2016-02-28      C       6  0.0303

Upvotes: 76

bunji
bunji

Reputation: 5213

If you want to use the reset_index method and also preserve your existing index you should use:

df.reset_index().set_index('index', drop=False)

or to change it in place:

df.reset_index(inplace=True)
df.set_index('index', drop=False, inplace=True)

For example:

print(df)
          gi  ptt_loc
0  384444683      593
4  384444684      594
9  384444686      596

print(df.reset_index())
   index         gi  ptt_loc
0      0  384444683      593
1      4  384444684      594
2      9  384444686      596

print(df.reset_index().set_index('index', drop=False))
       index         gi  ptt_loc
index
0          0  384444683      593
4          4  384444684      594
9          9  384444686      596

And if you want to get rid of the index label you can do:

df2 = df.reset_index().set_index('index', drop=False)
df2.index.name = None
print(df2)
   index         gi  ptt_loc
0      0  384444683      593
4      4  384444684      594
9      9  384444686      596

Upvotes: 15

Ted Petrou
Ted Petrou

Reputation: 61947

To provide a bit more clarity, let's look at a DataFrame with two levels in its index (a MultiIndex).

index = pd.MultiIndex.from_product([['TX', 'FL', 'CA'], 
                                    ['North', 'South']], 
                                   names=['State', 'Direction'])

df = pd.DataFrame(index=index, 
                  data=np.random.randint(0, 10, (6,4)), 
                  columns=list('abcd'))

enter image description here

The reset_index method, called with the default parameters, converts all index levels to columns and uses a simple RangeIndex as new index.

df.reset_index()

enter image description here

Use the level parameter to control which index levels are converted into columns. If possible, use the level name, which is more explicit. If there are no level names, you can refer to each level by its integer location, which begin at 0 from the outside. You can use a scalar value here or a list of all the indexes you would like to reset.

df.reset_index(level='State') # same as df.reset_index(level=0)

enter image description here

In the rare event that you want to preserve the index and turn the index into a column, you can do the following:

# for a single level
df.assign(State=df.index.get_level_values('State'))

# for all levels
df.assign(**df.index.to_frame())

Upvotes: 56

Apogentus
Apogentus

Reputation: 6613

For MultiIndex you can extract its subindex using

df['si_name'] = R.index.get_level_values('si_name') 

where si_name is the name of the subindex.

Upvotes: 43

Related Questions