Reputation: 1299
I have an xlsx sheet with sample data as shown below.
fileinfo: sample file | DateCreated: 09/23/2013 | Country: Japan
Num. | Name | Age | Sex | Country
1 | Mark | 45 | M | Australia
2 | Steve| 50 | M | United kingdom
3 | Julia| 35 | F | USA
fileinfo: sample file | DateCreated: 09/23/2013 | Country: Japan
Num. | Name | Age | Sex | Country
1 | Ronald | 64 | M | USA
2 | Micheal| 52 | M | China
3 | Zed | 35 | F | USA
The fileinfo row can appear anywhere in the file, any number of times. I need to make sure that I don't capture those rows in my code.
Is there a way in xlrd to capture just the headers and the data into a dictionary, leaving out the text rows (the fileinfo rows in this example)?
Here's the code that I have so far.
import xlrd

def importXLSX(fileName):
    with xlrd.open_workbook(fileName) as wb:
        worksheet = wb.sheet_by_index(0)
        num_rows = worksheet.nrows
        # Row 0 is the first fileinfo row, so the header is read from row 1.
        keys = [x.value for x in worksheet.row(1)]
        data = dict((x, []) for x in keys)
        # Note: this still picks up later fileinfo rows and repeated headers.
        for curr_row in range(2, num_rows):
            for i, val in enumerate(worksheet.row(curr_row)):
                data[keys[i]].append(val.value)
    return data

data = importXLSX('simple.xlsx')
print(data)
Upvotes: 0
Views: 463
Reputation: 1299
I finally found a way to do it. xlrd marks an empty cell with cell type 0 (xlrd.XL_CELL_EMPTY; its value is an empty string). So I was able to check each row for empty columns, figure out which rows are header rows, add them to a list, and finally iterate over that list and print them.
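A minimal sketch of that idea, not the exact code I ended up with. It assumes that every header and data row fills all five columns, while a fileinfo row fills only the first three and leaves the rest empty. The names rows_to_columns and import_xlsx are made up for illustration. The check here looks at the values from row_values(), where empty cells come back as empty strings, rather than at the cell type. It also assumes an xlrd release that can still read .xlsx files (versions before 2.0).

```python
def rows_to_columns(rows):
    """Build {header: [values]} from rows, skipping rows with empty cells.

    `rows` is a list of lists of cell values, with '' for empty cells,
    which is how xlrd's Sheet.row_values() reports them.
    """
    keys = None
    data = {}
    for values in rows:
        # Assumption: a fileinfo row leaves at least one column empty.
        if any(v == '' for v in values):
            continue
        if keys is None:
            # The first fully populated row is taken as the header.
            keys = list(values)
            data = dict((k, []) for k in keys)
        elif list(values) == keys:
            # The header repeats after each fileinfo row; skip the repeats.
            continue
        else:
            for k, v in zip(keys, values):
                data[k].append(v)
    return data


def import_xlsx(file_name):
    # Imported here so rows_to_columns() can be tried without xlrd installed.
    import xlrd
    with xlrd.open_workbook(file_name) as wb:
        sheet = wb.sheet_by_index(0)
        rows = [sheet.row_values(i) for i in range(sheet.nrows)]
    return rows_to_columns(rows)
```

Calling import_xlsx('simple.xlsx') on a sheet laid out like the sample should give one list per column, with the rows from both blocks appended in order. If a real data row can legitimately contain an empty cell, the "any empty cell" test is too strict and would need a narrower rule, for example checking whether the first cell starts with "fileinfo".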
Upvotes: 1