Reputation: 21
I have several types of invoice files and want to find the table in each one. I can already convert the scanned PDFs to images using 'pdf2jpg'. Now I need to extract the table from each invoice and write it to a CSV file using OCR with pytesseract. Please help.
Upvotes: 1
Views: 6623
Reputation: 1329
Perhaps this code will help you:
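import pytesseract
# Point pytesseract at the Tesseract executable (adjust to your install location).
pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract'
# Run OCR on the image file and get the recognized text as one string.
text = pytesseract.image_to_string('c:\\screenshot\\test.png')
# Save the text. Note: this is the raw OCR output, not parsed table columns.
with open('c:\\screenshot\\csvfile_1.csv', 'w', encoding='utf-8') as f:
    f.write(text)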
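The snippet above saves the OCR text as-is, so the file will not have real columns. If you need actual rows and cells, one possible approach is to use the word positions that pytesseract.image_to_data returns and group them yourself. The following is only a rough sketch under some assumptions: the helper names words_to_rows and ocr_table_to_csv are made up for illustration, and the row_tol and col_gap pixel thresholds are guesses that would need tuning per invoice layout. It also assumes the scan is not skewed.

```python
import csv


def words_to_rows(data, row_tol=10, col_gap=40):
    """Group OCR word boxes into table rows and cells (hypothetical helper).

    `data` is a dict with "text", "left", "top" and "width" lists, shaped like
    the result of pytesseract.image_to_data(..., output_type=Output.DICT).
    row_tol and col_gap are assumed pixel thresholds, not library defaults.
    """
    # Keep only boxes that contain text; structural entries have empty text.
    words = [
        (data["top"][i], data["left"][i], data["width"][i], text.strip())
        for i, text in enumerate(data["text"])
        if text.strip()
    ]
    words.sort()  # top to bottom, then left to right

    # Words whose top edge is within row_tol of the row's first word share a row.
    lines = []
    for word in words:
        if lines and abs(word[0] - lines[-1][0][0]) <= row_tol:
            lines[-1].append(word)
        else:
            lines.append([word])

    rows = []
    for line in lines:
        line.sort(key=lambda w: w[1])  # left to right within the row
        cells = [line[0][3]]
        prev_right = line[0][1] + line[0][2]
        for _top, left, width, text in line[1:]:
            if left - prev_right > col_gap:
                cells.append(text)        # wide gap: start a new cell
            else:
                cells[-1] += " " + text   # narrow gap: same cell
            prev_right = left + width
        rows.append(cells)
    return rows


def ocr_table_to_csv(image_path, csv_path):
    """OCR an image and write the grouped rows to a CSV file (hypothetical helper)."""
    import pytesseract  # imported here so words_to_rows works without Tesseract
    from PIL import Image

    data = pytesseract.image_to_data(
        Image.open(image_path), output_type=pytesseract.Output.DICT
    )
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(words_to_rows(data))
```

With this, something like ocr_table_to_csv('c:\\screenshot\\test.png', 'c:\\screenshot\\csvfile_1.csv') would write one CSV row per detected text line. It treats the whole page as a table, so for invoices you would probably want to crop the image to the table region first.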
Upvotes: 1