Reputation: 189
I am using Tesseract for OCR on Ubuntu 18.04.
I have a program that extracts the text from an image and prints it. I want it to create a new text file and write the extracted content into that file, but so far I have only managed the separate pieces below.
Here is my program, which extracts the text from an image:
from pytesseract import image_to_string
from PIL import Image
print(image_to_string(Image.open('sample.jpg')))
Here is the program that copies the text to the clipboard:
import os
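def addToClipBoard(text):
    # Pipe the text to the `clip` command through the shell.
    # Note: `clip` is a Windows command; on Ubuntu a tool such as xclip is needed instead.
    command = 'echo ' + text.strip() + '| clip'
    os.system(command)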
This program opens gedit with a new text file:
import subprocess
proc = subprocess.Popen(['gedit', 'file.txt'])
Any help would be appreciated.
Upvotes: 0
Views: 4546
Reputation: 57075
Just as I proposed in the comment, create a new file and write the extracted text into it:
from pytesseract import image_to_string
from PIL import Image
with open('file.txt', 'w') as outfile:
    outfile.write(image_to_string(Image.open('sample.jpg')))
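OCR output often contains non-ASCII characters (accents, typographic quotes), so it may be worth setting the file encoding explicitly rather than relying on the platform default. Here is a minimal sketch of that idea. The helper name `save_text` is my own and not part of pytesseract; `io.open` is used because it accepts `encoding` on both Python 2 and Python 3.

```python
import io


def save_text(text, path):
    """Write text to path as UTF-8, replacing any existing file."""
    # io.open behaves the same on Python 2 and 3 and takes an encoding.
    with io.open(path, 'w', encoding='utf-8') as outfile:
        outfile.write(text)
```

In the program from the question, this would be called as `save_text(image_to_string(Image.open('sample.jpg')), 'file.txt')`.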
Upvotes: 1
Reputation: 4792
If you just want the text, then open a text file and write to it:
from pytesseract import image_to_string
from PIL import Image
text = image_to_string(Image.open('sample.jpg'))
with open('file.txt', mode='w') as f:
    f.write(text)
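If you also want the result to open in gedit afterwards, as in your third snippet, the pieces could be combined roughly like this. Treat it as a sketch under some assumptions: it targets Python 3 (`shutil.which` and `open(..., encoding=...)`), and the function names `default_ocr`, `image_to_text_file` and `open_in_editor` are made up for illustration. The `ocr` parameter is only there so the OCR step can be swapped out, for example when pytesseract is not installed.

```python
import shutil
import subprocess


def default_ocr(image_path):
    """Extract text from an image with pytesseract."""
    # Imported lazily so the other helpers work without pytesseract installed.
    from pytesseract import image_to_string
    from PIL import Image
    return image_to_string(Image.open(image_path))


def image_to_text_file(image_path, out_path, ocr=default_ocr):
    """Run OCR on image_path, write the text to out_path, and return it."""
    text = ocr(image_path)
    with open(out_path, 'w', encoding='utf-8') as f:
        f.write(text)
    return text


def open_in_editor(path, editor='gedit'):
    """Open path in the editor if it is on PATH; return the process or None."""
    if shutil.which(editor) is None:
        return None
    return subprocess.Popen([editor, path])
```

Usage would then be `image_to_text_file('sample.jpg', 'file.txt')` followed by `open_in_editor('file.txt')`. The file is written and closed before the editor starts, so gedit should see the complete text.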
Upvotes: 2