Reputation: 609
I have an Excel file with an index that is merged over several rows in Excel. When I load it in pandas, it reads the first row of each merged block as the index label, and the remaining rows (the merged cells) are filled with NaNs. How can I loop over the index so that it fills the NaNs with the corresponding index label?
EDIT: Image of the Excel file removed by request. I don't have any specific code, but I can write an example.
import pandas as pd
df = pd.read_excel('myexcelfile.xlsx', header=1)
df.head()
Index-header Month
0 Index1 1
1 NaN 2
2 NaN 3
3 NaN 4
4 NaN 5
5 Index2 1
6 NaN 2
...
Upvotes: 3
Views: 399
Reputation: 294488
from io import StringIO
import pandas as pd
txt = """Index1,1
,2
,3
Index2,1
,2
,3"""
df = pd.read_csv(StringIO(txt), header=None, index_col=0, names=['Month'])
df
df.set_index(df.index.to_series().ffill(), inplace=True)
df
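For a version that runs without the CSV text, here is a minimal sketch of the same idea: turn the index into a Series, forward-fill it, and set it back. The frame below is made up to mimic the merged-cell layout in the question, and the name `filled` is only for illustration.

```python
import numpy as np
import pandas as pd

# Made-up frame: only the first row of each merged block carries a label.
df = pd.DataFrame(
    {"Month": [1, 2, 3, 1, 2, 3]},
    index=["Index1", np.nan, np.nan, "Index2", np.nan, np.nan],
)

# Convert the index to a Series so ffill() is available, then
# use the forward-filled labels as the new index.
filled = df.set_index(df.index.to_series().ffill())

print(filled)
```

Calling `set_index` without `inplace=True` returns a new frame and leaves `df` untouched, which makes it easier to compare before and after.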
Upvotes: 2
Reputation: 210902
Try this:
In [205]: df
Out[205]:
Index-header Month
0 Index1 1.0
1 NaN 2.0
2 NaN 3.0
3 NaN 4.0
4 NaN 5.0
5 Index2 1.0
6 NaN 2.0
... NaN NaN
In [206]: df['Index-header'] = df['Index-header'].fillna(method='pad')
In [207]: df
Out[207]:
Index-header Month
0 Index1 1.0
1 Index1 2.0
2 Index1 3.0
3 Index1 4.0
4 Index1 5.0
5 Index2 1.0
6 Index2 2.0
... Index2 NaN
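If the labels arrive as an ordinary column, as in the question's `df.head()`, a sketch of the full round trip might look like this. The data is typed in by hand rather than read from the Excel file, and `tidy` is a hypothetical name. `Series.ffill()` propagates the last non-missing label downward, and `set_index` then promotes the column to the index.

```python
import numpy as np
import pandas as pd

# Hand-typed stand-in for the frame read from Excel (assumed layout).
df = pd.DataFrame({
    "Index-header": ["Index1", np.nan, np.nan, np.nan, np.nan, "Index2", np.nan],
    "Month": [1, 2, 3, 4, 5, 1, 2],
})

# Forward-fill the labels, then promote the column to the index.
tidy = (
    df.assign(**{"Index-header": df["Index-header"].ffill()})
      .set_index("Index-header")
)

# Rows that belonged to the first merged cell can now be selected by label.
print(tidy.loc["Index1"])
```

Once the labels are filled in, things like `tidy.groupby(level=0)` also work per merged block.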
Upvotes: 4