Reputation: 215
I have first 15 rows of a excel file as "Header data". and after 235 rows, "Footer data". I need to read data in between these header and footer data.
Is there any way to read data into DataFrame by selecting specific range of rows using pandas?
Upvotes: 12
Views: 58375
Reputation: 583
You can also do this after loading the file:
df=df[(df.index>15)&(df.index<236)]
df.index-=16
Upvotes: 3
Reputation: 792
So to summarize. Header location is 15 from the top and Footer location is Y from the bottom. Here's how you import the correct values:
import pandas as pd
df=pd.read_excel("File.xls",header=15,skipfooter=_Y_)
Do ensure that your columnar data isn't being excluded!
Upvotes: 6
Reputation: 210972
Demo:
xl = pd.ExcelFile(filepath)
# parsing first (index: 0) sheet
total_rows = xl.book.sheet_by_index(0).nrows
skiprows = 15
nrows = 235 - 15
# calc number of footer rows
# (-1) - for the header row
skipfooter = total_rows - nrows - skiprows - 1
df = xl.parse(0, skiprows=skiprows, skipfooter=skipfooter)
Upvotes: 8
Reputation: 879
You are interested in data from row 15 to row 235.
You can try this:
import pandas as pd
df = pd.read_excel(somefile.xls)
df = df[15:236] #we have to include row 235
Upvotes: 6