Reputation: 377
I need to read multiple tables from a sheet in an Excel file with Python. The sheet looks something like this:
I want to get a Python object containing the information in First_Table, and the same for Second_Table. I tried using pandas and DataFrame.iloc this way:
import pandas as pd
xls = pd.ExcelFile('path_to_xls_file')
df = pd.read_excel(xls, "sheet_1")
# first table
df1 = df.iloc[2:12,0:6]
But I didn't get the expected cells from the First_Table. Am I doing something wrong with the ranges of the rows and columns? Does it have to be specified with the exact row and col indices or is there a more efficient and elegant way to do it?
Thanks in advance!
Upvotes: 7
Views: 6775
Reputation: 863
The approach is right, but it may not be optimal. You don't get the right table because the indices are off. According to your screenshot, df1 = df.iloc[1:12, 1:6]
should do the job.
A better solution would be to set the header and usecols parameters of pd.read_excel():
header : int, list of ints, default 0
    Row (0-indexed) to use for the column labels of the parsed DataFrame. If a list of integers is passed, those row positions will be combined into a MultiIndex. Use None if there is no header.
usecols : int or list, default None
    If None, then parse all columns.
    If int, then indicates the last column to be parsed.
    If list of ints, then indicates the list of column numbers to be parsed.
    If string, then indicates a comma-separated list of Excel column letters and column ranges (e.g. "A:E" or "A,C,E:F"). Ranges are inclusive of both sides.
Retrieved from: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_excel.html
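For illustration, here is a minimal sketch of how header and usecols can be combined with nrows (another read_excel parameter, which limits the number of data rows) to pull out one table at a time. The helper name read_table and the row and column positions in the usage comments are assumptions, since the screenshot is not available here; adjust them to the actual sheet layout.

```python
import pandas as pd


def read_table(path, sheet_name, header_row, columns, n_rows):
    """Read one rectangular table from a sheet that holds several tables.

    header_row: 0-indexed sheet row that holds the table's column labels.
    columns:    Excel column letters, e.g. "B:F" (inclusive on both sides).
    n_rows:     number of data rows directly below the header row.
    """
    return pd.read_excel(
        path,
        sheet_name=sheet_name,
        header=header_row,
        usecols=columns,
        nrows=n_rows,
    )


# Hypothetical usage; the positions below are placeholders, not taken from the sheet:
# first_table = read_table("path_to_xls_file", "sheet_1", header_row=1, columns="B:F", n_rows=10)
# second_table = read_table("path_to_xls_file", "sheet_1", header_row=14, columns="B:F", n_rows=10)
```

Each call parses only the requested block, so no iloc slicing is needed afterwards and the table's own header row becomes the column labels.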
Also, there might be packages designed for reading multiple tables in one sheet, but I am not aware of any.
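Without a dedicated package, one option is to read the whole sheet without a header and split it wherever a completely empty row appears. This is only a sketch under the assumption that the tables are stacked vertically and separated by at least one blank row; split_on_blank_rows is a hypothetical helper, not part of pandas.

```python
import pandas as pd


def split_on_blank_rows(raw):
    """Split a header-less DataFrame into blocks separated by fully empty rows."""
    is_blank = raw.isna().all(axis=1)
    # The running count of blank rows is constant within a block and
    # increases at every blank row, so it works as a block id.
    block_id = is_blank.cumsum()
    non_blank = raw[~is_blank]
    tables = []
    for _, block in non_blank.groupby(block_id[~is_blank]):
        # Drop columns that are empty within this block and renumber the rows.
        block = block.dropna(axis=1, how="all").reset_index(drop=True)
        tables.append(block)
    return tables


# Hypothetical usage with the file from the question:
# raw = pd.read_excel("path_to_xls_file", sheet_name="sheet_1", header=None)
# tables = split_on_blank_rows(raw)
```

Each returned block still carries its title and label rows as ordinary data, so the column labels have to be promoted from the appropriate row afterwards.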
See also this possible duplicate: "pandas read_excel multiple tables on the same sheet".
Upvotes: 2
Reputation: 2987
Use the usecols argument to select the columns you want to read from the Excel file. usecols only restricts columns; to restrict the rows as well, read_excel also accepts skiprows and nrows.
Also, leave index_col at None (its default) to avoid getting the first column as the index. Note that read_excel has no index parameter.
The following is example code for your task:
pd.read_excel(path, usecols=range(1, 6), index_col=None)
Find more information in the documentation.
Upvotes: 4