Reputation: 191
I am trying to extract tables from a PDF using Camelot and I get this AttributeError. Could you please help?
import camelot
import pandas as pd
pdf = camelot.read_pdf("Gordian.pdf")
AttributeError                            Traceback (most recent call last)
----> 1 pdf = camelot.read_pdf("Gordian.pdf")
AttributeError: module 'camelot' has no attribute 'read_pdf'
Upvotes: 19
Views: 48629
Reputation: 51
I had an error like this. It turned out that I had been stupid enough to create a test script named 'camelot.py' that was trying to import the camelot module :-) Python picked up my own script instead of the library. Renaming the script helped. The same happened with tabula.
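A quick way to check for this kind of name clash is to ask Python which file it would load for a given module name. This is only a minimal sketch using the standard library; module_origin and shadowed_by_local_script are hypothetical helper names, not part of Camelot.

```python
import importlib.util
from pathlib import Path


def module_origin(name):
    """Return the file Python would load for `name`, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def shadowed_by_local_script(name, directory="."):
    """Return True if a file called `<name>.py` sits in `directory`."""
    return (Path(directory) / f"{name}.py").is_file()


# If the first path points into your own project folder rather than
# site-packages, a local file is shadowing the installed library.
print("camelot resolves to:", module_origin("camelot"))
print("local camelot.py present:", shadowed_by_local_script("camelot"))
```

If the reported path is your own camelot.py, rename it and delete any leftover __pycache__ entry for it.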
Upvotes: 1
Reputation: 42
I believe most of the answers here are correct. However, the confusion lies in the name of the library. There are two Camelot libraries in PyCharm.
If you are looking for PDF table extraction, the one you need is called camelot-py, NOT Camelot.
Install camelot-py and then add import camelot.io as camelot
If you installed the wrong one of the two, importing camelot.io will not work.
If it still gives you an error, try removing Camelot from the project interpreter. That worked for me.
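To see which of the two packages is actually installed in the current interpreter, something like the following may help. It is a sketch that assumes Python 3.8+ (for importlib.metadata); installed_version is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# "camelot" and "camelot-py" are separate distributions; only camelot-py
# is the PDF table extraction library.
for dist in ("camelot", "camelot-py"):
    print(dist, "->", installed_version(dist))
```

If "camelot" shows a version, uninstalling it and keeping only "camelot-py" matches the advice above.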
Upvotes: 1
Reputation: 777
pip uninstall camelot
pip uninstall camelot-py
pip install camelot-py[cv]
! apt install ghostscript python3-tk
pip install ghostscript
Upvotes: 0
Reputation: 261
Here's the link with full installation steps: https://camelot-py.readthedocs.io/en/master/user/install.html#using-pip
After you install it with
pip install camelot-py[cv]
Write this:
import camelot.io as camelot
Upvotes: 2
Reputation: 1485
I abandoned trying to get camelot to work in Jupyter Notebooks to read tables and instead installed the following:
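import sys
from pathlib import Path
# Install into the same interpreter the notebook kernel is running
!{sys.executable} -m pip install tabula-py tabulate
from tabula import read_pdf
from tabulate import tabulate
pdf_path = Path.home() / "my_pdf.pdf"
# read_pdf returns a list of DataFrames here, one per table found on the page
df = read_pdf(str(pdf_path), pages=1)
df[0]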
Upvotes: 4
Reputation: 41
Please check whether you have Java installed on your machine: go to your terminal and run "java -version". If you do not have it, you won't be able to read PDFs using tabula, because tabula-py runs on Java (Camelot itself does not need it).
Once you have installed Java, install tabula-py using the command
pip install tabula-py
from tabula.io import read_pdf
tables = read_pdf('file.pdf') # substitute your file name
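The Java check can also be done from Python before calling read_pdf. This is just a sketch using the standard library; executable_on_path is a hypothetical helper name, and it only tells you whether a "java" command is on your PATH, not which version it is.

```python
import shutil


def executable_on_path(name):
    """Return True if a command called `name` can be found on PATH."""
    return shutil.which(name) is not None


if executable_on_path("java"):
    print("java found on PATH")
else:
    print("java not found on PATH; install Java before using tabula-py")
```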
Upvotes: 4
Reputation: 474
When installing the library, please pay attention to where it gets installed, because it may have been installed for a different Python version than the one that runs your script.
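One way to avoid that mismatch is to run pip through the same interpreter that runs your script. The sketch below only builds and prints the command; pip_install_command is a hypothetical helper name, and you would still need to run the command yourself.

```python
import sys


def pip_install_command(package):
    """Build a pip command tied to the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", package]


print("Running under:", sys.executable)
print("Install with:", " ".join(pip_install_command("camelot-py[cv]")))
```

Running the printed command in a terminal installs the package for exactly the Python shown on the first line.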
Upvotes: -1
Reputation: 878
NOTE: If you are using a virtual environment, activate it before doing any of this.
I have faced this error before. There is no bug in your code. The problem is with the camelot installation.
1 Remove the installed camelot version
2 Install it again using one of these commands. There are multiple ways to install camelot. Please try them one by one
pip install camelot-py
pip install camelot-py[cv]
pip install camelot-py[all]
3 Run your code. I have attached sample code here:
import camelot
data = camelot.read_pdf("test_file.pdf", pages='all')
print(data)
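After reinstalling, you can confirm that the camelot you import actually exposes read_pdf before pointing it at a PDF. This is a minimal sketch; module_has_attribute is a hypothetical helper name, and it only catches ImportError, so other errors raised while importing would still surface.

```python
import importlib


def module_has_attribute(module_name, attribute):
    """Return True if `module_name` imports and exposes `attribute`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Module is not installed (or one of its own imports is missing)
        return False
    return hasattr(module, attribute)


# False here means the wrong package is installed, nothing is installed,
# or a local file named camelot.py is shadowing the library.
print("camelot.read_pdf available:", module_has_attribute("camelot", "read_pdf"))
```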
Upvotes: 18