Reputation: 793
I have a dataframe that looks like so:
123 345 456 789
987 876 765 543
... ... ... ...
But the top row and leftmost column are taken as headers when they are actually values. Is there any way to shift them down/right and replace them with the default index?
EDIT: I have already considered setting header=None, but it is not an option. The dataframe was created via read_excel, but many parts of the program already use .loc and similar accessors that directly reference the header names that are to be dropped.
Upvotes: 7
Views: 20038
Reputation: 6539
How to insert a row at the top of the original DataFrame in-place
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]}) # Sample DataFrame
new_row = {'A': 7, 'B': 8} # Create a new row as a dictionary
df.loc[-1] = new_row # Append row to the bottom with index=-1
df.index = df.index + 1 # Shift df index by 1 to maintain index continuity
df.sort_index(inplace=True) # Sort by index to ensure the new row at the top
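A variation on the same idea, sketched with pd.concat. Unlike the snippet above it returns a new DataFrame instead of modifying df in place, and it avoids the index arithmetic. The helper name prepend_row is hypothetical, not a pandas function.

```python
import pandas as pd


def prepend_row(df, row):
    """Return a new DataFrame with `row` (a dict keyed by column name) on top."""
    top = pd.DataFrame([row], columns=df.columns)
    # ignore_index=True renumbers the result from 0, so the new row gets index 0
    return pd.concat([top, df], ignore_index=True)


df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})  # Sample DataFrame
result = prepend_row(df, {'A': 7, 'B': 8})
print(result)
```

The original df is left untouched, which may or may not be what you want if other code holds a reference to it.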
Upvotes: 0
Reputation: 862511
Use the parameter index_col=[0]. By default the first row is converted to column names, so no parameter is necessary for that:
import pandas as pd
from io import StringIO  # pd.compat.StringIO was removed from newer pandas versions
temp=u"""123;345;456;789
987;876;765;543"""
#after testing replace 'StringIO(temp)' with 'filename.csv'
df = pd.read_csv(StringIO(temp), sep=";", index_col=[0])
print (df)
     345  456  789
123
987  876  765  543
If the input data is a DataFrame with no header:
print (df)
     0    1    2    3
0  123  345  456  789
1  987  876  765  543
#set first row to columns
df.columns = df.iloc[0]
#remove first row from data and remove columns name
df = df.iloc[1:].rename_axis(None, axis=1)
#set index by first column
df = df.set_index(df.columns[0])
print (df)
     345  456  789
123
987  876  765  543
If all values in the data have the same type, you can use numpy indexing:
arr = df.values
df = pd.DataFrame(arr[1:,1:], index=arr[1:,0], columns=arr[0,1:])
df.index.name = arr[0,0]
print (df)
     345  456  789
123
987  876  765  543
Upvotes: 3
Reputation: 4607
For your case, you can simply shift the headers back into the data. If you are reading the data from a CSV file, though, you can avoid the problem at read time by not taking a header (header=None).
     345  456  789
123
987  876  765  543
df.reset_index().T.reset_index().T
Out:
         0    1    2    3
index  123  345  456  789
0      987  876  765  543
pd.read_csv('data.csv',header=None)
Out:
     0    1    2    3
0  123  345  456  789
1  987  876  765  543
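The shifted result above still carries the leftover row labels index and 0. A minimal sketch that wraps the same double reset_index/transpose and then restores a default index is shown below. The function name headers_to_values is hypothetical, and the sample frame is built by hand to mimic the one in the question.

```python
import pandas as pd


def headers_to_values(df):
    """Move the index and the column labels back into the data."""
    out = df.reset_index().T.reset_index().T
    # Replace the leftover 'index' / 0 row labels with a default RangeIndex
    return out.reset_index(drop=True)


# A frame shaped like the one in the question: 123 became the index name,
# and 345/456/789 became the column labels
df = pd.DataFrame({345: [876], 456: [765], 789: [543]},
                  index=pd.Index([987], name=123))
fixed = headers_to_values(df)
print(fixed)
```

Note that transposing a frame with mixed column types yields object dtype, so with real data you may need to convert the columns back afterwards.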
Upvotes: 12
Reputation: 2720
There seems to be an issue with how the dataframe is created. How is it created? You can most likely solve the issue right at creation.
If that, however, is not an option, try the following:
pandas.DataFrame.reset_index() is what you want. As for the column names, just add them as a regular row using pandas.DataFrame.append() (deprecated in pandas 1.4 and removed in 2.0; pandas.concat() is the replacement) with df.columns as an argument (where df is your dataframe), and rename the columns afterwards.
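The steps described here could look roughly like the following sketch. It uses pd.concat rather than append so that it also runs on current pandas versions, and the sample frame is an assumption built to mimic the one in the question.

```python
import pandas as pd

df = pd.DataFrame({345: [876], 456: [765], 789: [543]},
                  index=pd.Index([987], name=123))

# Step 1: move the index back into a regular column (named 123 here)
flat = df.reset_index()

# Step 2: turn the column labels into a one-row frame with the same labels
header_row = pd.DataFrame([flat.columns.tolist()], columns=flat.columns)

# Step 3: stack the header row on top of the data, then rename the columns
result = pd.concat([header_row, flat], ignore_index=True)
result.columns = range(result.shape[1])
print(result)
```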
Upvotes: 0