RubenGeert
RubenGeert

Reputation: 2952

Code creates generator object but I'd like a list instead

I have a number of single sheet Excel files and I'd like to create a list containing sublists. Each sublist is supposed to contain 1) the file name, 2) the sheet name and 3) the row values for each row in the sheet except the first (header) row. I tried the following:

rdir=r"d:\temp"
import xlrd,spss
fils=[fil for fil in os.listdir(rdir) if fil.endswith(".xls")]
allData=[]
for cnt,fil in enumerate(fils):
    wb=xlrd.open_workbook(os.path.join(rdir,fil))
    allData.append([fil]+[wb.sheets()[0].name]+wb.sheets()[0].row_values(row) for row in range(1,wb.sheets()[0].nrows))

This does not give me the list I'm looking for but a list of generator objects instead. What am I doing wrong here and how can I obtain the list I'm looking for?

Upvotes: 1

Views: 259

Answers (3)

RubenGeert
RubenGeert

Reputation: 2952

Thanks everyone for all the input! The list I needed is generated by

rdir=r"d:\temp"
import xlrd,spss
fils=[fil for fil in os.listdir(rdir) if fil.endswith(".xls")]
allData=[]
for cnt,fil in enumerate(fils):
    wb=xlrd.open_workbook(os.path.join(rdir,fil))
    ws=wb.sheets()[0]
    for row in range(1,ws.nrows):
        allData.append([fil]+[ws.name]+[val for val in ws.row_values(row)])

Upvotes: 0

Michael David Watson
Michael David Watson

Reputation: 3071

It looks like your list comprehension is incorrect. It should read:

    allData.append([fil]+[wb.sheets()[0].name]+[wb.sheets()[0].row_values(row) for row in range(1,wb.sheets()[0].nrows)])

Also I would reccomend establishing s1 = wb.sheets()[0] to make the code more legible and slightly faster

rdir=r"d:\temp"
import xlrd,spss
fils=[fil for fil in os.listdir(rdir) if fil.endswith(".xls")]
allData=[]
for cnt,fil in enumerate(fils):
    wb=xlrd.open_workbook(os.path.join(rdir,fil))
    s1 = wb.sheets()[0]
    allData.append([fil]+[s1.name]+[s1.row_values(row) for row in range(1,s1.nrows)])

Also you mentioned in your comment that you wanted to see how to do this with multiple sheets. Assuming your fields in each are the same you could use this:

rdir=r"d:\temp"
import xlrd,spss
fils=[fil for fil in os.listdir(rdir) if fil.endswith(".xls")]
allData=[]
for cnt,fil in enumerate(fils):
    wb=xlrd.open_workbook(os.path.join(rdir,fil))
    for sheet in wb.sheets():
        allData.append([fil]+[sheet.name]+[sheet.row_values(row) for row in range(1,sheet.nrows)])

Upvotes: 2

Shawn Zhang
Shawn Zhang

Reputation: 1884

Not sure what you want , I guess problem is your missing [] on the list comprehension .

allData.append([fil]+[wb.sheets()[0].name]+ [ wb.sheets()[0].row_values(row) for row in range(1,wb.sheets()[0].nrows) ] )

then your final list should be ["Filename","WorksheetName",[row_value list]]

Let's be simple:

>>> list_a = [1,2,3]
>>> list_b = [ ['a','b','c'],['a','b','c'],['a','b','c']]
>>> [list_a + x for x in list_b]
[[1, 2, 3, 'a', 'b', 'c'], [1, 2, 3, 'a', 'b', 'c'], [1, 2, 3, 'a', 'b', 'c']]
>>> [list_a + [ x for x in list_b ] ]
[[1, 2, 3, ['a', 'b', 'c'], ['a', 'b', 'c'], ['a', 'b', 'c']]] 

Upvotes: 2

Related Questions