Reputation: 113
I'm trying to extract text from a scanned technical drawing. For confidentiality reasons I cannot post the actual drawing, but it looks similar to this one, only much busier and with more text inside shapes. The problem is complicated by letters touching both each other and their surrounding borders and symbols.
I found an interesting paper that does exactly this: "Detection of Text Regions From Digital Engineering Drawings" by Zhaoyang Lu. It's behind a paywall, so you might not be able to access it, but essentially it tries to erase everything that is not text from the image in two main steps:
1) Erases linear components, including long and short isolated lines
2) Erases non-text strokes in terms of analysis of connected components of strokes
What kind of OpenCV functions would help in performing these operations? I would rather not write something from the ground up to do these, but I suspect I might have to.
I've tried using a template-based approach to try to isolate the text, but since the text location isn't completely normalized between drawings (even in the same project), it fails in detecting text past the first scanned figure.
Upvotes: 5
Views: 3149
Reputation: 1
I am working on a similar problem, except my drawings are not as complex. Here is what worked for me:
import keras_ocr
image = "pid.PNG"
# Load Keras OCR model (detector and recognizer)
pipeline = keras_ocr.pipeline.Pipeline()
img_array = keras_ocr.tools.read(image)
# Detect and recognize text; returns one list of (text, box) pairs per input image
detected_texts = pipeline.recognize([img_array])
# Print the recognized text for the first (and only) image
for d in detected_texts[0]:
    print(d[0])  # d[0] is the recognized text, d[1] is its bounding box
Upvotes: 0
Reputation: 325
I am working on a similar problem. Technical drawings are difficult because most OCR software tries to find text baselines, and drawing artifacts (lines, etc.) get in the way of that approach. In the drawing you linked, not many characters touch each other, so I suggest breaking the image into connected regions of contiguous black pixels and running OCR on each region individually. The height of each region also gives you an indication of whether it is text or a piece of the drawing. To find the connected regions, use a flood fill algorithm; for the OCR itself, Tesseract does a good job.
Upvotes: 1
Reputation: 547
I've never attempted this specific task, but if the image really looks like the one you showed, I would start by removing all vertical and horizontal lines. This could be done fairly easily: set a width threshold, and for every pixel with intensity above some value N, examine that many pixels perpendicular to the hypothetical line orientation. If it looks like a line, erase it.
A more elegant and perhaps better approach would be to run a Hough transform for lines and circles and remove those elements that way.
You could also try some FFT-based filtering, but I'm not so sure about that.
I've never used OpenCV, but I would guess it can do the things I mentioned.
Upvotes: 0