Reputation: 73
I have an excel sheet which has multiple tables in it, using openpyxl .tables method to read the tables but getting empty list eventhough there are two tables but it return empty. Is there a way to achieve it in python. I need to further process the data from these tables after extracting it into a dataframe. But the tables itself is not getting detected. Any pointers on this would be helpful.
from openpyxl import load_workbook import pandas as pd
#read file wb = load_workbook('29.xlsx')
#access specific sheet ws = wb["Sheet1"]
print(ws.tables.items())
Below is the structure of the Excel sheet.
Upvotes: 0
Views: 1327
Reputation: 73
Parsing the dataframe read, able to get only the tables as ouptut as given below, is there a better way to handle this, so it works for other excel files of similar kind with multiple tables in it.
from operator import index
import xlrd
import pandas
import math
df = pandas.read_excel('29.xlsx', engine='openpyxl',index_col=None)
noofColumsn = df.shape[1]
a_list = []
for i in df.itertuples():
j = 0
for x in i:
if(pandas.isna(x)):
j = j + 1
if(j == (noofColumsn -1)):
break
if(j < (noofColumsn -1)):
list(i)
print(i)
a_list.append(i)
df1 = pandas.DataFrame(a_list)
del df1[df1.columns[0]]
print(df1.head(1))
df1.to_excel("output.xlsx",header=None,index = False)
Upvotes: -1