Reputation: 21
I currently have a dataframe that looks like this:
| Unnamed:1|Unnamed:2 |Unnamed:3 |
| Column 1 | Column 2 | Column 3 |
0 | Data | Data | Data |
1 | Data | Data | Data |
2 | Data | Data | Data |
3 | Data | Data | Data |
I've been using the following code unsuccessfully as it replaces the header with the index [0] row:
new_header = df.iloc[0]
df = df[1:]
df.columns = new_header
Ideally the new dataframe would look like:
| Column 1 | Column 2 | Column 3 |
0 | Data | Data | Data |
1 | Data | Data | Data |
2 | Data | Data | Data |
3 | Data | Data | Data |
Upvotes: 0
Views: 444
Reputation: 13407
It seems that your data is being read in as having a pd.MultiIndex
column. This means that functionally each of your column labels is actually a tuple of values:
[("Unnamed:1", "Column 1"), ("Unnamed:2", "Column 2"), ...]
To fix this you can drop the 0th level of your columns using df.droplevel(0, axis=1)
method. This will drop the first element of each of those aforementioned tuples
. We specify axis=1
because we want to drop the 0th level of the column index, and not the row index.
Snippet to try:
df = df.droplevel(0, axis=1)
Reproducible snippet:
columns = pd.MultiIndex.from_arrays(
[["Unnamed:1", "Unnamed:2", "Unnamed:3"],
["Column 1", "Column 2", "Column 3"]]
)
data = [[1,2,3],[4,5,6],[7,8,9]]
df = pd.DataFrame(data, columns=columns)
print(df)
Unnamed:1 Unnamed:2 Unnamed:3
Column 1 Column 2 Column 3
0 1 2 3
1 4 5 6
2 7 8 9
Now using .droplevel(...)
:
df = df.droplevel(0, axis=1)
print(df)
Column 1 Column 2 Column 3
0 1 2 3
1 4 5 6
2 7 8 9
Upvotes: 1
Reputation: 4209
You are dealing with Multiindex columns. If you want to keep just the second header line, this would be enough: df.columns = df.columns.get_level_values(1)
Upvotes: 1