Rushat Rai

Reputation: 793

Turn the column headers into the first row and row headers into the first column in a Pandas dataframe

I have a dataframe that looks like so:

123  345   456   789
987  876   765   543
...  ...   ...   ...

But the top row and leftmost column are taken as headers when they are actually values. Is there any way to shift them down/right and replace them with the default index?

EDIT: I have already considered setting header=None, but it is not an option. The dataframe was created via read_excel, but many parts of the program already use .loc and similar accessors that directly reference the header names that are to be dropped.

Upvotes: 7

Views: 20038

Answers (4)

pymen

Reputation: 6539

How to insert a row at the top of the original DataFrame in-place

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})  # Sample DataFrame


new_row = {'A': 7, 'B': 8} # Create a new row as a dictionary
df.loc[-1] = new_row  # Append row to the bottom with index=-1

df.index = df.index + 1  # Shift df index by 1 to maintain index continuity
df.sort_index(inplace=True)  # Sort by index to ensure the new row at the top
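
Applied to the question, the same trick can push the column labels down into the data. This is a minimal sketch, assuming a small frame with integer column labels like the one in the question; the sample values and the final relabelling with range() are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Assumed sample: the labels 345, 456, 789 are really data values
df = pd.DataFrame([[876, 765, 543]], columns=[345, 456, 789])

df.loc[-1] = df.columns.to_list()  # add the column labels as a row with index -1
df.index = df.index + 1            # shift every index label up by 1
df = df.sort_index()               # the former -1 row (now 0) moves to the top
df.columns = range(df.shape[1])    # replace the old labels with default 0..n-1

print(df)
```

The row headers can be moved into the data beforehand with df.reset_index() if they are values as well.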

Upvotes: 0

jezrael

Reputation: 862511

Use the parameter index_col=[0]. By default the first row is converted to column names, so no parameter is needed for that:

import pandas as pd
from io import StringIO  # pd.compat.StringIO was removed from newer pandas versions

temp=u"""123;345;456;789
987;876;765;543"""
#after testing replace 'StringIO(temp)' with 'filename.csv'
df = pd.read_csv(StringIO(temp), sep=";", index_col=[0])

print (df)
     345  456  789
123               
987  876  765  543

If the input data is a DataFrame with no header:

print (df)
     0    1    2    3
0  123  345  456  789
1  987  876  765  543

#set first row to columns
df.columns = df.iloc[0]
#remove first row from data and remove columns name
df = df.iloc[1:].rename_axis(None, axis=1)
#set index by first column
df = df.set_index(df.columns[0])
print (df)
     345  456  789
123               
987  876  765  543

If all values in the data have the same type, you can use numpy with indexing:

arr = df.values
df = pd.DataFrame(arr[1:,1:], index=arr[1:,0], columns=arr[0,1:])
df.index.name = arr[0,0]
print (df)
     345  456  789
123               
987  876  765  543

Upvotes: 3

Naga kiran

Reputation: 4607

For your case, you can just shift the headers into the data. But if you are reading the data from a CSV file, you can avoid the problem at read time by not taking a header (header=None).

    345 456 789
123         
987 876 765 543

df.reset_index().T.reset_index().T

Out:

         0  1   2   3
index   123 345 456 789
0       987 876 765 543


pd.read_csv('data.csv',header=None)

Out:

    0   1   2   3
0   123 345 456 789
1   987 876 765 543
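
The chained reset_index above leaves a row labelled 'index' behind. A minimal sketch of one way to finish with a plain default index, assuming the frame from the question (the sample construction and the trailing reset_index(drop=True) are additions for illustration):

```python
import pandas as pd

# Assumed sample: 123 is the index name, 987 the only index value
df = pd.DataFrame(
    [[876, 765, 543]],
    index=pd.Index([987], name=123),
    columns=[345, 456, 789],
)

out = (
    df.reset_index()            # row header becomes the first column
      .T.reset_index()          # column labels become the first column of the transpose
      .T                        # transpose back: labels are now the first row
      .reset_index(drop=True)   # swap the 'index'/0 row labels for 0..n-1
)

print(out)
```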

Upvotes: 12

Lukas Thaler

Reputation: 2720

There seems to be an issue with the creation of the dataframe. How is the dataframe created? You can most likely solve your issue right at creation time.

If that, however, is not an option, try the following: pandas.DataFrame.reset_index() is what you want for the row headers. As for the column names, just add them as a regular row using pandas.DataFrame.append() with df.columns as an argument (where df is your dataframe) and rename the columns afterwards. Note that DataFrame.append() was removed in pandas 2.0; pandas.concat() is its replacement.
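
A minimal sketch of that approach, using pandas.concat() in place of DataFrame.append() so that it also runs on pandas 2.x. The sample frame and the variable names flat, header_row and result are assumptions for illustration.

```python
import pandas as pd

# Assumed sample: 123 is the index name, 987 the only index value
df = pd.DataFrame(
    [[876, 765, 543]],
    index=pd.Index([987], name=123),
    columns=[345, 456, 789],
)

flat = df.reset_index()  # row headers become a regular first column

# Build a one-row frame holding the column labels, then put it on top
header_row = pd.DataFrame([flat.columns.tolist()], columns=flat.columns)
result = pd.concat([header_row, flat], ignore_index=True)

# Rename the columns to the default 0..n-1 labels
result.columns = range(result.shape[1])

print(result)
```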

Upvotes: 0
