Reputation: 121
I would like to know how to extract data from a PDF file using Scrapy. Which module should I use, and what is the most effective way to do it? Could you please point me to some sample tutorials on this?
Thanks!!
Upvotes: 4
Views: 6616
Reputation: 3691
I suggest you download the PDF with Scrapy and then use PyPDF2 to extract the content inside the PDF.
For a complete but somewhat dated example (it uses the older pyPDF library), take a look at this site.