Reputation: 1432
I am trying to use tabula-py to read a PDF. I installed it with pip install tabula-py.
I have also installed the required dependencies:
requests
pandas
pytest
flake8
My code is currently as follows:
import tabula
import pandas as pd
df = tabula.read_pdf("report.pdf", pages=2)
print(df)
I am getting the following error:
Traceback (most recent call last):
File "tabula_pdf_reader.py", line 1, in <module>
import tabula
ImportError: No module named tabula
Any ideas about what I am missing here?
Upvotes: 5
Views: 26067
Reputation: 195
I had this problem in Azure ML Studio. I solved the issue by changing the Python interpreter to Python 3.8 - AzureML (it might be different for you; try different ones, or maybe the interpreter can be deduced from the path below). Then add the path to the installed packages with:
import sys
sys.path.append('/anaconda/envs/azureml_py38/lib/python3.8/site-packages')
Change the path to whatever you get when you run the following (sysconfig is used here because distutils was removed in Python 3.12):
import sysconfig
print(sysconfig.get_paths()["purelib"])
Hopefully it helps someone :)
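Before editing sys.path, it can help to confirm which interpreter is actually running and whether it can see the module at all. The following is a minimal diagnostic sketch, not part of the original answer; the helper name module_visible is made up for illustration.

```python
import importlib.util
import sys


def module_visible(name: str) -> bool:
    """Return True if the running interpreter can locate the top-level module `name`."""
    return importlib.util.find_spec(name) is not None


# Path of the interpreter executing this script; compare it with the one pip installed into.
print("interpreter:", sys.executable)
# False here means tabula-py is not installed for this particular interpreter.
print("tabula importable:", module_visible("tabula"))
```

If the second line prints False, the package was most likely installed for a different Python than the one shown on the first line.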
Upvotes: 0
Reputation: 1
For macOS users: updating the operating system to Monterey will solve the problem.
Upvotes: 0
Reputation: 45
I got the same issue when running from the terminal. However, after I started with 'ipython3' instead of 'ipython', it worked perfectly. You have to make sure that the tabula-py module is installed for Python 3, not Python 2.
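One way to avoid the Python 2 versus Python 3 mix-up is to invoke pip through the same interpreter that will run the script. This is a sketch under that assumption; pip_install_command is a hypothetical helper, and it only builds and prints the command rather than running it.

```python
import sys
from typing import List


def pip_install_command(package: str) -> List[str]:
    """Build a pip command bound to the interpreter that is running this script."""
    # "python -m pip" guarantees the package lands in this interpreter's site-packages.
    return [sys.executable, "-m", "pip", "install", package]


# Show the command; pass the list to subprocess.run(..., check=True) to execute it.
print(" ".join(pip_install_command("tabula-py")))
```

Running the printed command from the terminal installs tabula-py for exactly the Python that produced it.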
Upvotes: 2
Reputation: 341
Use Camelot instead:
import camelot
tables = camelot.read_pdf('foo.pdf')
tables.export('foo.csv', f='csv', compress=True)
Upvotes: 1
Reputation: 121
I faced this same issue in Ubuntu.
First, check the versions of the JDK and JRE installed on your machine by running java -version and javac -version (the single-dash form also works on Java 8 and earlier, where java --version fails). Each should report a version greater than 7.
Then use pip3 to install tabula-py.
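The Java check can also be done from Python. This is a rough sketch, not something from the original answer: java_version_output is a made-up helper name, and it uses the single-dash java -version form because older JDKs accept it too.

```python
import shutil
import subprocess
from typing import Optional


def java_version_output() -> Optional[str]:
    """Return the text printed by `java -version`, or None if java is not on PATH."""
    java = shutil.which("java")
    if java is None:
        return None
    # `java -version` writes its banner to stderr, so capture both streams.
    result = subprocess.run([java, "-version"], capture_output=True, text=True)
    return (result.stderr or result.stdout).strip()


print(java_version_output() or "java not found on PATH")
```

If this prints "java not found on PATH", install a JDK or fix PATH first, since tabula-py calls Java under the hood.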
Upvotes: 2