Bill

Reputation: 11623

Anaconda did not install packages openpyxl and xlrd

After creating a new Python 3.6 environment with pandas, numpy, etc., I tried to use the following pandas method and got this error:

>>> df.to_excel(filename)
ModuleNotFoundError: No module named 'openpyxl'

A similar issue occurred earlier when I used the pd.read_excel method.

In both cases the problem was solved by installing openpyxl / xlrd with conda install, but I would like to know whether this is intentional behaviour and why openpyxl/xlrd are not treated as dependencies of pandas and installed from the beginning.

Upvotes: 3

Views: 6490

Answers (2)

HassanSh__3571619

Reputation: 2077

FYI, in my case installing the missing dependency with conda did not solve the problem; installing it with pip did: pip install xlrd.

Upvotes: 0

merv

Reputation: 76970

Yes, this is intentional. If you read the Optional Dependencies section of the Pandas documentation, you can see that Excel I/O is included in there.

A couple of arguments I can think of for why this is a good thing:

  1. There are so many features incorporated into Pandas that including everything by default would really bloat installs.
  2. There are multiple compatible alternatives for Excel I/O, so it may not be fair to impose a particular choice on people, especially if they already have one installed for another dependency.
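Since several packages can fill this role, it can help to check which ones are present in the active environment before calling the Excel methods. The snippet below is a minimal sketch using only the standard library; the helper name `available_excel_packages` and the candidate list are assumptions for illustration, not part of pandas and not the full set of engines it supports.

```python
from importlib.util import find_spec

# Candidate Excel I/O packages. This tuple is an illustrative assumption,
# not an authoritative list of what pandas can use.
EXCEL_PACKAGES = ("openpyxl", "xlrd", "xlsxwriter")


def available_excel_packages(candidates=EXCEL_PACKAGES):
    """Return the candidate packages that can be imported in this environment."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in candidates if find_spec(name) is not None]


if __name__ == "__main__":
    print(available_excel_packages())
```

An empty list means none of the candidates is installed in the interpreter that ran the check, which is also a quick way to confirm that conda or pip installed into the environment you expected.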

However, I do think the error handling here could be improved. For example, it would have been better to provide a message saying that this functionality isn't available without one of the packages, rather than hitting a hard ModuleNotFoundError.
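As a rough sketch of the kind of message I mean, a wrapper could catch the failed import and re-raise it with installation advice. The helper `require_optional` below is hypothetical and written only for illustration; it is not how pandas itself handles the import.

```python
import importlib


def require_optional(name, feature):
    """Import an optional package, or raise an ImportError that explains the fix.

    `name` is the package to import; `feature` is a human-readable label
    for the functionality that needs it. Both are supplied by the caller.
    """
    try:
        return importlib.import_module(name)
    except ModuleNotFoundError as exc:
        # Re-raise with guidance, keeping the original error as the cause.
        raise ImportError(
            f"{feature} requires the optional package '{name}'. "
            f"Install it with 'conda install {name}' or 'pip install {name}'."
        ) from exc
```

Calling `require_optional("openpyxl", "Writing Excel files")` in an environment without openpyxl would then fail with a message naming both the feature and the missing package, instead of a bare ModuleNotFoundError.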

Upvotes: 4
