Reputation: 2399
I currently have a dataframe that looks like this:
      Unnamed: 1    Unnamed: 2   Unnamed: 3  Unnamed: 4
0  Sample Number  Group Number  Sample Name  Group Name
1            1.0           1.0          s_1         g_1
2            2.0           1.0          s_2         g_1
3            3.0           1.0          s_3         g_1
4            4.0           2.0          s_4         g_2
I'm looking for a way to delete the header row and make the first row the new header row, so the new dataframe would look like this:
   Sample Number  Group Number  Sample Name  Group Name
0            1.0           1.0          s_1         g_1
1            2.0           1.0          s_2         g_1
2            3.0           1.0          s_3         g_1
3            4.0           2.0          s_4         g_2
I've tried something along the lines of checking if 'Unnamed' in df.columns: and then writing the dataframe out without its header:
df.to_csv(newformat, header=False, index=False)
but I don't seem to be getting anywhere.
Upvotes: 216
Views: 440643
Reputation: 1
When reading a file with pandas, this works for me:
pd.read_csv('file_path',header=0)
Upvotes: -1
Reputation: 180
If you are starting with a list of lists
pd.DataFrame(input[1:], columns=input[0])
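A minimal sketch of the list-of-lists case, assuming the first inner list holds the column names. The name `rows` and the sample values are illustrative only; `rows` is used instead of `input` because `input` shadows a Python built-in.

```python
import pandas as pd

# Hypothetical data: the first inner list is the header row.
rows = [
    ["Sample Number", "Group Number", "Sample Name", "Group Name"],
    [1.0, 1.0, "s_1", "g_1"],
    [2.0, 1.0, "s_2", "g_1"],
]

# Everything after the first list becomes data; the first list becomes the columns.
df = pd.DataFrame(rows[1:], columns=rows[0])
print(df)
```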
Upvotes: 2
Reputation: 753
This seems like a task that may be needed more than once. I've taken rgalbo's answer and written a simple function that can be lifted and placed into any project.
def promote_df_headers(df):
    '''
    Takes a df and uses the first row as the header

    Parameters
    ----------
    df : DataFrame
        Any df with one or more columns.

    Returns
    -------
    df : DataFrame
        Input df with the first row removed and used as the column names.
    '''
    new_header = df.iloc[0]
    df = df[1:]
    df.columns = new_header
    df = df.reset_index(drop=True)
    return df
Upvotes: 0
Reputation: 31
For some reason, I had to do it this way:
df.columns = [*df.iloc[0]]
df = df[1:]
Unpacking the row into a list looks redundant, but otherwise the headers still turn up as part of the actual table.
Upvotes: 3
Reputation: 1994
Alternatively, we can do this when reading the file with pandas.
In this case we can use:
pd.read_csv('file_path', skiprows=1)
This skips the first row of the file and uses the second row as the column names.
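As a rough illustration of the skiprows approach, the sketch below reads from an in-memory string rather than a real file. The file contents, including the junk first line, are made up for the example.

```python
import io
import pandas as pd

# Hypothetical file contents: a junk first line, then the real header.
raw = io.StringIO(
    "junk,junk,junk,junk\n"
    "Sample Number,Group Number,Sample Name,Group Name\n"
    "1,1,s_1,g_1\n"
    "2,1,s_2,g_1\n"
)

# skiprows=1 drops the first line, so the second line is parsed as the header.
df = pd.read_csv(raw, skiprows=1)
print(df.columns.tolist())
```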
Upvotes: 7
Reputation: 339
Another one-liner, using Python's tuple assignment:
df, df.columns = df[1:], df.iloc[0]
This won't reset the index.
Note that the reverse order won't work as expected: df.columns, df = df.iloc[0], df[1:]
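The order matters because Python evaluates the whole right-hand side first and then assigns the targets from left to right. The sketch below shows both orders on a made-up two-column frame; `make_frame`, `good_columns` and `bad_columns` are hypothetical names used only for this illustration.

```python
import pandas as pd

def make_frame():
    # Hypothetical two-column version of the frame in the question.
    return pd.DataFrame(
        [["Sample Number", "Group Number"], [1.0, 1.0], [2.0, 1.0]],
        columns=["Unnamed: 1", "Unnamed: 2"],
    )

# Works: df is rebound to the slice first, then the slice gets the new columns.
df = make_frame()
df, df.columns = df[1:], df.iloc[0]
good_columns = list(df.columns)

# Reversed targets: the original frame is renamed, then df is rebound to the
# slice that was built before the rename, so the old labels survive.
df = make_frame()
df.columns, df = df.iloc[0], df[1:]
bad_columns = list(df.columns)

print(good_columns)
print(bad_columns)
```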
Upvotes: 21
Reputation: 112
The best practice and best one-liner is to set the header when reading the file:
df = pd.read_excel('file_path', header=1)
Notice the header value:
header refers to the row number(s) to use as the column names. Make no mistake: the row number refers to the Excel file, not the df (0 is the first row, 1 is the second, and so on).
This way, you get the column names you want and don't have to write additional code or create a new df.
The good thing is that it drops the replaced row.
Upvotes: -4
Reputation: 1
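header = table_df.iloc[0]  # first row holds the real column names
table_df.drop([0], axis=0, inplace=True)  # remove that row from the data
table_df.reset_index(drop=True, inplace=True)  # without inplace=True the result is discarded
table_df.columns = header
table_df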
Upvotes: 0
Reputation: 2246
Here's a simple trick that defines the column index in a single expression. Because set_index promotes a column to the row index, we can do the same thing for columns by transposing the data frame, setting the index, and transposing it back:
df = df.T.set_index(0).T
Note that you may have to change the 0 in set_index(0) if your rows already have a different index.
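A small sketch of the transpose trick on a made-up two-column frame, assuming the header row carries the index label 0. The name `result` is illustrative.

```python
import pandas as pd

# Hypothetical frame whose first row (index label 0) holds the real headers.
df = pd.DataFrame(
    [["Sample Number", "Sample Name"], [1.0, "s_1"], [2.0, "s_2"]],
    columns=["Unnamed: 1", "Unnamed: 2"],
)

# Transpose, promote column 0 (the old first row) to the index, transpose back.
result = df.T.set_index(0).T

# The column index inherits the name 0 from set_index; clear it if unwanted.
result.columns.name = None
print(result)
```

The remaining rows keep their original labels (1 and 2 here), so a reset_index(drop=True) may still be wanted afterwards.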
Upvotes: 11
Reputation: 809
Another way to do this:
df.columns = df.iloc[0]
df = df.reindex(df.index.drop(0)).reset_index(drop=True)
df.columns.name = None
   Sample Number  Group Number  Sample Name  Group Name
0            1.0           1.0          s_1         g_1
1            2.0           1.0          s_2         g_1
2            3.0           1.0          s_3         g_1
3            4.0           2.0          s_4         g_2
Upvotes: 1
Reputation: 335
@ostrokach's answer is best. Most likely you want the change to persist across every reference to the dataframe, so you would benefit from inplace=True.
df.rename(columns=df.iloc[0], inplace=True)
df.drop([0], inplace=True)
Upvotes: 10
Reputation: 8906
The dataframe can be changed by just doing
df.columns = df.iloc[0]
df = df[1:]
Then
df.to_csv(path, index=False)
Should do the trick.
Upvotes: 107
Reputation: 19912
If you want a one-liner, you can do:
df.rename(columns=df.iloc[0]).drop(df.index[0])
Upvotes: 77
Reputation: 4465
new_header = df.iloc[0]  # grab the first row for the header
df = df[1:]  # take the data less the header row
df.columns = new_header  # set the header row as the df header
Upvotes: 363