dirac_delta

Reputation: 113

Extracting text from scanned engineering drawings

I'm trying to extract text from a scanned technical drawing. For confidentiality reasons I cannot post the actual drawing, but it looks similar to this, only a lot busier, with more text inside shapes. The problem is quite complex because letters touch both each other and their surrounding borders/symbols.

I found an interesting paper that does exactly this, called "Detection of Text Regions From Digital Engineering Drawings" by Zhaoyang Lu. It's behind a paywall so you might not be able to access it, but essentially it tries to erase everything that's not text from the image, mainly through two steps:

1) Erases linear components, including long and short isolated lines

2) Erases non-text strokes by analysing the connected components of the strokes

Which OpenCV functions would help perform these operations? I would rather not write something from the ground up, but I suspect I might have to.

I've tried a template-based approach to isolate the text, but since the text location isn't completely normalized between drawings (even within the same project), it fails to detect text beyond the first scanned figure.

Upvotes: 5

Views: 3149

Answers (3)

pluto2111

Reputation: 1

I am working on a similar problem, except my drawings are not as complex. Here is what worked for me:

import keras_ocr

image = "pid.PNG"

# Load Keras OCR model (detector and recognizer)
pipeline = keras_ocr.pipeline.Pipeline()

img_array = keras_ocr.tools.read(image)

# Perform text detection
detected_texts = pipeline.recognize([img_array])

# Print the detected text for the first (and only) image
for d in detected_texts[0]:
    print(d[0])  # d[0] contains the detected text

Upvotes: 0

Optavius

Reputation: 325

I am working on a similar problem. Technical drawings are an issue because OCR software mostly tries to find text baselines, and the drawing artifacts (lines etc.) get in the way of that approach. In the drawing you posted there are not many characters touching each other, so I suggest breaking the image into regions of contiguous (black) pixels and then scanning those individually. The height of a contiguous region should also give you an indication of whether it is text or a piece of the drawing. To break the image into contiguous regions, use a flood fill algorithm; for the scanning, Tesseract does a good job.

Upvotes: 1

Á. Márton

Reputation: 547

Obviously I've never attempted this specific task, but if the image really looks like the one you showed, I would start by removing all vertical and horizontal lines. This can be done fairly easily: set a width threshold, and for every pixel whose intensity exceeds some value N, look at that many pixels perpendicular to the hypothetical line orientation; if the neighbourhood looks like a line, erase it.

More elegant, and perhaps better, would be to do a Hough transform for lines and circles and remove those elements that way.

You could also maybe try some FFT-based filtering, but I'm not so sure about that.

I've never used OpenCV, but I would guess it can do the things I mentioned.

Upvotes: 0
