lali

Reputation: 1

PDFMinerLoader adds extra newlines after bullets and list numbers

When I parse PDF files with PDFMinerLoader, it inserts extra newlines wherever it encounters bullets or list numbers, so the markers end up separated from their text. For example:

Original PDF:

  1. use the...
  2. replace the..
  3. update the..

I got:

1.
2.
3.
use the..
replace the..
update the..

The same thing happens with bullet points such as ●.

How can I keep each marker on the same line as its text?

I tried switching to a different parser, but the results were worse: the extracted text ran together.
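
One workaround that might be worth trying is to post-process the extracted text instead of changing the parser. The sketch below is only a heuristic and assumes the output always looks like the example above: a run of marker-only lines followed by the same number of text lines. The function name `rejoin_list_markers` and the set of bullet characters are my own assumptions, not part of PDFMiner or LangChain.

```python
import re

# Matches a line that holds nothing but a list marker, e.g. "1.", "2)", "●", "-".
# The set of bullet characters is an assumption; extend it as needed.
MARKER = re.compile(r"^\s*(?:\d+[.)]|[●•▪◦\-*])\s*$")


def rejoin_list_markers(text: str) -> str:
    """Re-attach list markers that were extracted on lines of their own."""
    lines = text.split("\n")
    out = []
    i = 0
    while i < len(lines):
        if not MARKER.match(lines[i]):
            out.append(lines[i])
            i += 1
            continue

        # Collect the run of consecutive marker-only lines.
        j = i
        while j < len(lines) and MARKER.match(lines[j]):
            j += 1
        markers = [m.strip() for m in lines[i:j]]

        # Take up to the same number of non-empty text lines that follow.
        k = j
        items = []
        while (
            k < len(lines)
            and len(items) < len(markers)
            and lines[k].strip()
            and not MARKER.match(lines[k])
        ):
            items.append(lines[k].strip())
            k += 1

        if len(items) == len(markers):
            # Pair each marker with its text line, in order.
            out.extend(f"{m} {t}" for m, t in zip(markers, items))
            i = k
        else:
            # Pattern not recognised: leave the marker lines untouched.
            out.extend(lines[i:j])
            i = j
    return "\n".join(out)


if __name__ == "__main__":
    sample = "1.\n2.\n3.\nuse the..\nreplace the..\nupdate the.."
    print(rejoin_list_markers(sample))
```

With LangChain, this would presumably be applied to the `page_content` of each document returned by the loader. It will not help if list items wrap across several lines, since the marker and text counts would then no longer match and the lines are left unchanged.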

Answers (0)
