Panda Bear

Reputation: 11

ModuleNotFoundError: No module named 'haystack.nodes'

I am following the Extractive QA tutorial from Haystack's website and am trying to convert a PDF to text. The blog post is here: https://www.deepset.ai/blog/automating-information-extraction-with-question-answering

I pip-installed haystack but I still get this error. I even tried !pip install haystack.nodes, but that doesn't work either.

Note: I am using Google Colab for this.

Here is my detailed code and error:

!pip -q install haystack haystack.nodes
path = '/content/drive/MyDrive/Colab Notebooks/NLP/Information Extraction QA with Haystack (Adidas Financial corpus)'
from haystack.nodes import PDFToTextConverter

pdf_converter = PDFToTextConverter(remove_numeric_tables=True, valid_languages=['en'])

converted = pdf_converter.convert(file_path = path, meta = { 'company': 'Company_1', 'processed': False })

ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-7-61021fb3b7b8> in <cell line: 1>()
----> 1 from haystack.nodes import PDFToTextConverter
      2 
      3 pdf_converter = PDFToTextConverter(remove_numeric_tables=True, valid_languages=['en'])
      4 
      5 converted = pdf_converter.convert(file_path = path, meta = { 'company': 'Company_1', 'processed': False })

Upvotes: 1

Views: 6457

Answers (4)

Lucifer

Reputation: 147

Preface

I had the same issue and had already tried all the solutions above, including:

  • installing farm-haystack and uninstalling haystack-ai
  • making sure only one of them was installed
  • making sure it was using the same pip and venv
  • and many other fixes.

Solution

The fix was very simple in my case. It turns out I was using Jupyter code cells to install haystack in Visual Studio Code, and I just had to restart my editor 🫠🤦. Just to make sure, I also uninstalled and reinstalled the haystack-ai Python library:

  1. Open a terminal in the environment where you installed the haystack package.
  2. Run pip uninstall farm-haystack haystack-ai -y
  3. Then run pip install haystack-ai
  4. Run python and try import haystack or any of its submodules.

Voilà 🎆, it's done.
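After restarting the editor or kernel, it can help to confirm which interpreter is actually running and which of the packages from this thread it can see. The following is only a minimal sketch using the standard library's importlib.metadata; the helper names installed_versions and has_conflict are made up for illustration and are not part of Haystack.

```python
import sys
from importlib import metadata

# Distribution names that come up in this thread. "haystack" is the name
# the question installed; the answers point to the other two instead.
CANDIDATES = ("haystack", "farm-haystack", "haystack-ai")


def installed_versions(names=CANDIDATES):
    """Map each distribution in `names` that is installed to its version."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # not installed for the interpreter running this code
    return found


def has_conflict(found):
    """True when farm-haystack and haystack-ai are both installed."""
    return "farm-haystack" in found and "haystack-ai" in found


if __name__ == "__main__":
    # If this path is not the environment you ran pip in, the install
    # went somewhere the kernel cannot see.
    print("interpreter:", sys.executable)
    found = installed_versions()
    print("installed:", found or "none of the candidates")
    if has_conflict(found):
        print("farm-haystack and haystack-ai are both installed")
```

Run it in the same notebook or kernel that raises the ModuleNotFoundError, not in a separate terminal, since the two may use different environments.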

Upvotes: 0

Joy

Reputation: 97

Note that installing farm-haystack and haystack-ai in the same Python environment (virtualenv, Colab, or system) causes problems. In my case, I had to enable the Telemetry environment. These steps solved the problem for me:

!pip uninstall -y farm-haystack haystack-ai
!pip install --upgrade pip
!pip install farm-haystack[colab,ocr,preprocessing,file-conversion,pdf]

Then, I enabled the "Telemetry" environment by adding these lines at the top of my script:

from haystack.telemetry import tutorial_running
tutorial_running(8)

Upvotes: 0

Julian Risch

Reputation: 271

To install Haystack, you need to run pip install farm-haystack. The PyPI package is called farm-haystack and not just haystack, as Stefano mentioned.

A good starting point is the Haystack tutorials, which you can run as Python notebooks on Google Colab, for example this tutorial using the PDFToTextConverter.

Upvotes: 1

Kilian Kramer

Reputation: 109

Do not name any of your files haystack.py, otherwise your imports will fail. This goes for all projects: never give a file the same name as the library itself. ;-)
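A quick way to check for this is to look at where Python resolves the module from, and whether the working directory contains something with the same name. The snippet below is just a sketch using the standard library; module_origin and local_shadow_candidates are illustrative names I made up.

```python
import importlib.util
from pathlib import Path


def module_origin(name):
    """Return the path Python would load for top-level module `name`, or None."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    return spec.origin if spec is not None else None


def local_shadow_candidates(name, directory="."):
    """List files or folders in `directory` that could shadow module `name`."""
    base = Path(directory)
    candidates = [base / f"{name}.py", base / name]
    return [p for p in candidates if p.exists()]


if __name__ == "__main__":
    # If this prints a path inside your project instead of site-packages,
    # a local file is being imported in place of the library.
    print("haystack resolves to:", module_origin("haystack"))
    for path in local_shadow_candidates("haystack"):
        print("possible shadowing file:", path)
```

If a local haystack.py turns up, renaming it (and deleting any stale __pycache__ entry for it) should let the real package be imported again.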

Upvotes: 0
