When using PDFMinerLoader to parse PDF files, I've noticed that it inserts extra newlines whenever it encounters bullets or numbered list items. For example:
Original PDF:
I got:
1.
2.
3.
use the..
replace the..
update the..
Similar issues occur with bullet points, such as: ●
How can I address this problem?
I tried switching to a different parser, but the results were worse: pieces of text ended up concatenated together.
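One workaround I could imagine is post-processing the extracted string rather than changing the parser. The sketch below assumes the text always comes out in the shape shown above: a run of marker-only lines ("1.", "2.", "●") followed by the same number of text lines in the same order. The function name `rejoin_list_markers` and the set of bullet glyphs are my own placeholders, not part of PDFMinerLoader or pdfminer, and the pairing logic may well misfire on PDFs that are laid out differently.

```python
import re

# A line that contains nothing but a list marker: "1.", "2)", or a bullet glyph.
# The glyph set is an assumption; extend it for the bullets your PDFs use.
MARKER_ONLY = re.compile(r"^\s*(?:\d+[.)]|[●•▪◦])\s*$")


def rejoin_list_markers(text: str) -> str:
    """Re-attach list markers that were split onto their own lines."""
    lines = text.split("\n")
    out = []
    i = 0
    while i < len(lines):
        if not MARKER_ONLY.match(lines[i]):
            out.append(lines[i])
            i += 1
            continue

        # Collect the run of consecutive marker-only lines.
        markers = []
        while i < len(lines) and MARKER_ONLY.match(lines[i]):
            markers.append(lines[i].strip())
            i += 1

        # Pair each marker, in order, with the next non-blank text line.
        for marker in markers:
            while i < len(lines) and not lines[i].strip():
                i += 1  # skip blank lines between markers and text
            if i < len(lines) and not MARKER_ONLY.match(lines[i]):
                out.append(f"{marker} {lines[i].strip()}")
                i += 1
            else:
                out.append(marker)  # no text left to pair with
    return "\n".join(out)


sample = "1.\n2.\n3.\nuse the..\nreplace the..\nupdate the.."
print(rejoin_list_markers(sample))
# 1. use the..
# 2. replace the..
# 3. update the..
```

This would be applied to the text of each loaded document after parsing. It only treats the symptom, so a fix at the layout-analysis level would still be preferable if one exists.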