Ori Netanel Ben-Zaken

Reputation: 377

How to read multiple tables from .xls file in python?

I need to read multiple tables from a single sheet of an Excel file with Python. The sheet looks something like this: [image]

I want to get a Python object containing the information in First_Table, and the same for Second_Table. I tried using pandas and DataFrame.iloc this way:

import pandas as pd
xls = pd.ExcelFile('path_to_xls_file')
df = pd.read_excel(xls, "sheet_1")
# first table
df1 = df.iloc[2:12,0:6]

But I didn't get the expected cells from the First_Table. Am I doing something wrong with the ranges of the rows and columns? Does it have to be specified with the exact row and col indices or is there a more efficient and elegant way to do it?

Thanks in advance!

Upvotes: 7

Views: 6775

Answers (2)

Sokolokki

Reputation: 863

The approach is right, but it may not be optimal. You are not getting the right table because the indices are off: according to your screenshot, df1 = df.iloc[1:12,1:6] should do the job.

A better solution would be to set the header and usecols parameters of pd.read_excel():
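To see why a slice can miss, keep in mind that iloc positions are 0-based and the end of each range is excluded. Also, pd.read_excel by default uses the first sheet row as the column header, so DataFrame row positions are shifted relative to the sheet. Here is a minimal sketch on a made-up in-memory frame. The layout is an assumption for illustration, not the actual sheet from the question:

```python
import pandas as pd

# Hypothetical sheet contents, as they would look when read with header=None:
# row 0 holds a title, row 1 the column names, rows 2-4 the data,
# and column 0 is an empty margin column.
raw = pd.DataFrame([
    [None, "First_Table", None],
    [None, "id", "value"],
    [None, 1, 10],
    [None, 2, 20],
    [None, 3, 30],
])

# iloc is 0-based and end-exclusive: this selects rows 2, 3, 4
# and columns 1, 2.
block = raw.iloc[2:5, 1:3]

# Promote the sheet's header row to column labels and renumber the rows.
table = block.copy()
table.columns = raw.iloc[1, 1:3].tolist()
table = table.reset_index(drop=True)
print(table)
```

Reading with header=None and then slicing keeps the DataFrame positions aligned with the sheet rows, which makes the ranges easier to reason about.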

header : int, list of ints, default 0

Row (0-indexed) to use for the column labels of the parsed DataFrame. If a list of integers is passed, those row positions will be combined into a MultiIndex. Use None if there is no header.

usecols : int or list, default None

If None, then parse all columns.

If int, then indicates the last column to be parsed.

If list of ints, then indicates the list of column numbers to be parsed.

If string, then indicates a comma-separated list of Excel column letters and column ranges (e.g. "A:E" or "A,C,E:F"). Ranges are inclusive of both sides.

Retrieved from: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_excel.html
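As a rough sketch of how these parameters combine, the helper below reads one rectangular table from a sheet. The function name read_table and its arguments are made up for illustration. It also uses the nrows parameter of pd.read_excel, which is not quoted above, to stop reading before the next table starts:

```python
import pandas as pd

def read_table(path, sheet_name, header_row, columns, n_rows):
    """Read one rectangular table out of a sheet (hypothetical helper).

    header_row: 0-indexed sheet row that holds the column names.
    columns:    Excel-style column range, e.g. "B:F" (inclusive).
    n_rows:     number of data rows below the header row.
    """
    return pd.read_excel(
        path,
        sheet_name=sheet_name,
        header=header_row,   # row to use for the column labels
        usecols=columns,     # only parse these columns
        nrows=n_rows,        # stop before the next table starts
    )
```

It could be called once per table, for example read_table('path_to_xls_file', 'sheet_1', 1, 'B:F', 10), where the row and column values are placeholders that have to match the actual sheet layout.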

Also, there might be packages designed for reading multiple tables in one sheet, but I am not aware of any.
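Without a dedicated package, one way to avoid hard-coding row indices is to split the sheet wherever a completely empty row appears. This is only a sketch and rests on assumptions: the tables are stacked vertically, at least one fully blank row separates them, and the first row of each block is its header. The helper name split_on_blank_rows is made up:

```python
import pandas as pd

def split_on_blank_rows(raw):
    """Split a header-less DataFrame into blocks separated by all-NaN rows."""
    blank = raw.isna().all(axis=1)
    # Every blank row bumps the running count, so the non-blank rows
    # between two blank rows share one group id.
    group_ids = blank.cumsum()
    tables = []
    for _, block in raw[~blank].groupby(group_ids[~blank]):
        block = block.dropna(axis=1, how="all")   # drop empty margin columns
        header, body = block.iloc[0], block.iloc[1:]
        body = body.reset_index(drop=True)
        body.columns = header.tolist()            # first row becomes the header
        tables.append(body)
    return tables

# Made-up data standing in for pd.read_excel(path, header=None).
raw = pd.DataFrame([
    ["id", "value", None],
    [1, 10, None],
    [2, 20, None],
    [None, None, None],
    ["name", "city", "zip"],
    ["a", "b", "c"],
])
tables = split_on_blank_rows(raw)
print(len(tables))
```

With a real file, raw would come from pd.read_excel(path, sheet_name=..., header=None), so that no sheet row is consumed as a header before the split.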

Possibly a duplicate of: pandas read_excel multiple tables on the same sheet

Upvotes: 2

Keval Dave

Reputation: 2987

Use the usecols argument to select the columns you want to read from the Excel file. pandas will then return the rows for those columns only.

Also, leave index_col at None (its default) so that the first selected column is not used as the index. Note that read_excel has no index parameter; the relevant one is index_col.

The following is example code for your task:

pd.read_excel(path, usecols=range(1, 6), index_col=None)

Find more information in the documentation.

Upvotes: 4
