user15047438
user15047438

Reputation:

Cannot read all sheets of the excel file using pandas

This program works well for reading 1st sheet of the excel file, while I am trying to read all the sheets of the excel file I cannot read all the sheets.

python

while I add sheet_name = None in the reading excel line I am getting error like this I can not find out what is the error and how to rectify the error.

(scrap-3-HFZx_P-py3.9) PS F:\mohan> & C:/Users/hp/AppData/Local/pypoetry/Cache/virtualenvs/scrap-3-HFZx_P-py3.9/Scripts/python.exe f:/mohan/main.py
ENTER THE LIST HERE : userid,city,state
enter the full path to the file : F:\\mis\\KB.xlsx
Traceback (most recent call last):
  File "f:\mohan\main.py", line 16, in <module>     
    print(obj.extract(file_name))
  File "f:\mohan\main.py", line 9, in extract       
    return raw_excel[conf].to_dict(orient='records')
TypeError: unhashable type: 'list'

Upvotes: 1

Views: 833

Answers (2)

jezrael
jezrael

Reputation: 862431

Solution:

#output is nested lists of list of dictionaries
def extract(self, file_name):
    raw_excel=pd.read_excel(file_name,sheet_name=None)
    return [v[v.columns.intersection(conf)].to_dict(orient='records')
                  for k, v in raw_excel.items()]

Explanation:

If use sheet_name=None output is dictionary of DataFrames, here raw_excel.

If need loop by dict here is used list comprehension with method items, so v is values and k for keys.

For filter only columns from DataFrame if exist in conf is used Index.intersection.

Last is used to_dict, so get for each DataFrame list of dictionaries. Final output, in another words return get lists of list of dictionaries.

If need flatten ouput is possible use this solution:

flat_list = [item for sublist in t for item in sublist]

So code is changed:

#output is flatten list of dictionaries
def extract(self, file_name):
    raw_excel=pd.read_excel(file_name,sheet_name=None)
    return [x for k, v in raw_excel.items() 
               for x in v[v.columns.intersection(conf)].to_dict(orient='records')]

Upvotes: 1

asifzuba
asifzuba

Reputation: 470

  • This is a bit quirky, but when you do sheet_name=None, the returned type is not a pandas DataFrame but a dict. I personally don't like when return type changes with different arguments but it is what it is

  • However, this solves the mystery of the type error. Because you are now getting a dict back, and so when you subset with a list ...it can't do that operation and will give you a Unhashable Type error.

  • I would suggest that you have a conditional which checks for type and then does some extra processing.

  • Let me know if you need an elaborate example for type checking.

Upvotes: 0

Related Questions