Camelot-py not detecting two lines of text in one row

Question

Scraping table data from a .PDF using Camelot-py, and it is not picking up stacked lines of text (refer to rows 9 and 10 below)

Rows 9 and 10 are void of text for account.

https://camelot-py.readthedocs.io/en/master/user/advanced.html#specify-table-areas

Here is the code I have in .ipynb format. The first block is for the first table that pulls as expected, the second is for page 9.

Table

tables= camelot.read_pdf(r'C:\PDFFilePath', pages='9', line_scale=40)
    tables[0].to_csv(r'Loans&Leases')
    camelot.plot(tables[0], kind ='contour')
    plt.show()

Using MatPlotLib, I can see that Camelot is correctly detecting the table area/grid for page 9.

plot table area

plot grid

Here is a Google Drive link to the PDF

Call Report PDF

Any insight would be greatly appreciated.

Camelot-py not detecting two lines of text in one row

Table

Answers (1)

Related Questions