Reputation: 191

AttributeError: module 'camelot' has no attribute 'read_pdf'

I am trying to extract tables from a PDF using Camelot, and I get this AttributeError. Could you please help?

import camelot
import pandas as pd
pdf = camelot.read_pdf("Gordian.pdf")

AttributeError Traceback (most recent call last)
----> 1 pdf = camelot.read_pdf("Gordian.pdf")

AttributeError: module 'camelot' has no attribute 'read_pdf'

Upvotes: 19

Answers (9)

Suvorov Oleg

Reputation: 51

I had an error like this. It turned out that I had created a test script named 'camelot.py' that tried to import the camelot module, so the script ended up importing itself instead of the library :-) Renaming the script helped. The same happened with tabula.
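A minimal sketch for spotting this kind of name clash, assuming the clash comes from a file or package in the folder you run your script from. The helper name `find_shadowing_files` is made up for illustration and is not part of Camelot.

```python
from pathlib import Path


def find_shadowing_files(module_name, directory="."):
    """Return local paths that Python would import instead of the installed library.

    Hypothetical helper: it only looks for `<module_name>.py` and
    `<module_name>/__init__.py` inside `directory`.
    """
    base = Path(directory)
    candidates = [
        base / f"{module_name}.py",            # e.g. a stray camelot.py test script
        base / module_name / "__init__.py",    # e.g. a local folder set up as a package
    ]
    return [path for path in candidates if path.exists()]


if __name__ == "__main__":
    for path in find_shadowing_files("camelot"):
        print(f"Rename or remove: {path}")
```

If this prints anything, renaming that file (and deleting any leftover `__pycache__` entry for it) is the first thing to try.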

Upvotes: 1

Ekkogaming

Reputation: 42

I believe most of the answers here are correct. However, the confusion lies in the name of the library. There are two Camelot libraries in PyCharm.

If you are looking for PDF table extraction, the package is called camelot-py, NOT Camelot.

After installing it, add import camelot.io as camelot

If you installed the wrong one of the two libraries, importing camelot.io will not work.

If it still gives you an error, try removing Camelot from the project interpreter. That worked for me.
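To see which of the two packages is actually installed in the current interpreter, something like the following may help. It is only a sketch: `installed_versions` is a hypothetical helper name, and it relies on the standard-library `importlib.metadata` (Python 3.8+).

```python
from importlib import metadata


def installed_versions(distribution_names):
    """Map each pip distribution name to its installed version, or None if absent."""
    versions = {}
    for name in distribution_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # "camelot" and "camelot-py" are the two distribution names discussed above.
    for name, version in installed_versions(["camelot", "camelot-py"]).items():
        print(f"{name}: {version or 'not installed'}")
```

If `camelot` shows a version and `camelot-py` does not, the wrong package is installed.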

Upvotes: 1

JAGJ jdfoxito

Reputation: 777

pip uninstall camelot
pip uninstall camelot-py
pip install camelot-py[cv]

Install the Ghostscript app from the internet

! apt install ghostscript python3-tk
pip install ghostscript

Upvotes: 0

Desert Eagle

Reputation: 261

Here's the link with full installation steps: https://camelot-py.readthedocs.io/en/master/user/install.html#using-pip

After you install

pip install camelot-py[cv]

Write this:

import camelot.io as camelot

Upvotes: 2

Joe Gasewicz

Reputation: 1485

I gave up trying to get Camelot to work in Jupyter notebooks for reading tables and instead installed the following:

import sys
!{sys.executable} -m pip install tabula-py tabulate

from pathlib import Path

from tabula import read_pdf
from tabulate import tabulate


pdf_path = (
    Path.home()
    / "my_pdf.pdf"
)
df = read_pdf(str(pdf_path), pages=1)
df[0]

Upvotes: 4

vijay chauhan

Reputation: 41

Please check whether you have Java installed on your machine: go to your terminal and run "java -version". tabula-py is a wrapper around tabula-java, so without Java you won't be able to read PDFs using tabula.

Once you have installed Java, install tabula-py using the command pip install tabula-py.

from tabula.io import read_pdf
tables = read_pdf('file.pdf')  # substitute your file name
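The Java check can also be done from Python before calling read_pdf. This is a rough sketch that only tests whether a `java` executable is on the PATH; `java_available` is a made-up helper name.

```python
import shutil


def java_available():
    """Return True if a `java` executable can be found on the PATH."""
    return shutil.which("java") is not None


if __name__ == "__main__":
    if java_available():
        print("Java found on PATH")
    else:
        print("Java not found; install it before using tabula-py")
```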

Upvotes: 4

Chadee Fouad

Reputation: 2948

Try this:

import camelot.io as camelot

That worked for me.
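If you want code that works either way, one option is to look for read_pdf on the top-level package first and fall back to the io submodule. The sketch below assumes camelot-py is the package installed; `resolve_read_pdf` is a hypothetical helper, not part of Camelot.

```python
import importlib


def resolve_read_pdf(package_name="camelot"):
    """Return the package's read_pdf function, trying `<package>.io` as a fallback."""
    package = importlib.import_module(package_name)
    if hasattr(package, "read_pdf"):
        return package.read_pdf
    # Fallback: the same import that `import camelot.io as camelot` performs.
    io_module = importlib.import_module(f"{package_name}.io")
    return io_module.read_pdf


# Usage (needs camelot-py installed):
# read_pdf = resolve_read_pdf()
# tables = read_pdf("Gordian.pdf")
```

If the fallback import raises ModuleNotFoundError as well, the installed `camelot` is most likely not camelot-py, or a local file is shadowing it.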

Upvotes: 10

Umit KOC

Reputation: 474

When installing the library, pay attention to where it gets installed: it may have been installed for a different Python version than the one running your code.
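One way to rule this out is to ask the interpreter that runs your script where it lives, and to call pip through that same interpreter. A small sketch; `pip_command` is just an illustrative helper name.

```python
import subprocess
import sys


def pip_command(*args):
    """Build a pip command line bound to the interpreter running this script."""
    return [sys.executable, "-m", "pip", *args]


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    # Shows where camelot-py is installed for this interpreter, if at all.
    subprocess.run(pip_command("show", "camelot-py"), check=False)
```

Installing with the same pattern, for example `pip_command("install", "camelot-py[cv]")`, puts the package where this interpreter will look for it.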

Upvotes: -1

waruna k

Reputation: 878

NOTE: If you are using a virtual environment, activate it before doing any of this.

I have faced this error before. There is no bug in your code. The problem is with the Camelot installation.

1 Remove the installed camelot version.

2 Install it again using one of these commands. There are multiple ways to install Camelot; try them one by one.

pip install camelot-py
pip install camelot-py[cv]
pip install camelot-py[all]

3 Run your code. I have attached sample code here:

import camelot

data = camelot.read_pdf("test_file.pdf", pages='all')
print(data)

Upvotes: 18
