Reputation: 2655
I want to segment the characters from the background. So far I have been able to detect the text and generate bounding boxes around it. (see image)
Some people also consider generating bounding boxes around the text to be segmentation, but what I'm looking for is the segmentation of the characters themselves from the background. (see image, the green part)
I would use this segmentation to remove the Korean text and replace it with its English translation.
You might be thinking of using black/white color detection for this segmentation, but the background won't necessarily be pure white.
Some Other Examples:
Can you provide any leads on the problem at hand? My end goal is to superimpose the English text on the image so that it looks natural and not out of place.
I am open to any suggestions and ideas; if you'd like to come up with a completely different, out-of-the-box solution to segment the image, that would also be great. All I'm looking for is to replace the Korean text with its English alternative.
Upvotes: 1
Views: 882
Reputation: 928
So you already have the bounding boxes wrapping the text, right? (I read that in the comments.)
So the remaining steps are:
A) Extract your Region of Interest (ROI). Be aware that NumPy slicing addresses the box corners as img[y1:y2, x1:x2], i.e. rows first.
B) Separate your letters from the background. I would use a simple binary threshold here. The remaining challenge is to find a globally optimal threshold value that works for all your pictures (see the Otsu sketch further below). See the Python code for A and B:
import cv2 as cv
import numpy as np
from matplotlib import pyplot as plt

img = cv.imread('kjgqd.png', 0)  # load as grayscale
img = cv.medianBlur(img, 5)      # smooth out noise before thresholding

# A) extract the ROI with the box coordinates: img[y1:y2, x1:x2]
imgROI = img[75:230, 103:330]

# B) binarize with a fixed global threshold
ret, th1 = cv.threshold(imgROI, 127, 255, cv.THRESH_BINARY)

titles = ['Original Image', 'Global Thresholding (v = 127)']
images = [imgROI, th1]
for i in range(2):
    plt.subplot(2, 1, i + 1), plt.imshow(images[i], 'gray')
    plt.title(titles[i])
    plt.xticks([]), plt.yticks([])
plt.show()
input:
ROI cut:
output:
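A note on B: rather than hand-tuning a value like 127 for every picture, Otsu's method can pick a threshold per image from its histogram. A minimal sketch, reusing the ROI from above:
import cv2 as cv

img = cv.imread('kjgqd.png', 0)
img = cv.medianBlur(img, 5)
imgROI = img[75:230, 103:330]

# THRESH_OTSU computes the threshold from the image histogram,
# so the threshold value passed in (0) is ignored
ret, thOtsu = cv.threshold(imgROI, 0, 255, cv.THRESH_BINARY + cv.THRESH_OTSU)
print('Otsu chose threshold:', ret)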
C) How do you get the text from the picture into written language? Welcome to the big world of OCR. This is by far the most complex part of your task.
I would really recommend using a framework like Tesseract. A good introduction is here.
In the worst case you would have to train your own model to recognize your language. I would only recommend this if your language or font set is not yet well supported by Tesseract. See howtos 1 and 2.
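To sketch how Tesseract could be driven from Python here: a minimal example, assuming the pytesseract wrapper and Tesseract's Korean language pack are installed, and using a hypothetical 'roi.png' for the cut-out box:
import pytesseract
from PIL import Image

# 'roi.png' is a placeholder for your binarized text box;
# lang='kor' assumes the Korean traineddata is installed
text = pytesseract.image_to_string(Image.open('roi.png'), lang='kor')
print(text)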
For educational reasons, and only for that purpose, I would recommend starting with the topic of OCR via the famous MNIST example.
Here is a good overview of a whole zoo of possible methods for getting text out of pictures.
D) Translate your text.
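One way this step could be scripted, assuming the unofficial googletrans package (any translation API would do equally well):
from googletrans import Translator  # unofficial wrapper around Google Translate

translator = Translator()
# translate from Korean (src='ko') to English (dest='en')
result = translator.translate('안녕하세요', src='ko', dest='en')
print(result.text)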
E) Put the English translation into the picture with cv2.putText, like:
import numpy as np
import cv2

# white dummy picture (600 px wide, 200 px tall)
img = np.zeros((200, 600, 3), np.uint8)
img.fill(255)  # make it white

# write some text
font = cv2.FONT_HERSHEY_SIMPLEX
bottomLeftCornerOfText = (10, 100)
fontScale = 1
fontColor = (0, 0, 0)
thickness = 2

cv2.putText(img, 'Hello World!',
            bottomLeftCornerOfText,
            font,
            fontScale,
            fontColor,
            thickness)

# display the image
cv2.imshow("img", img)
cv2.waitKey(0)
cv2.destroyAllWindows()
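To make the overlay look natural rather than clipped, you can measure the rendered string with cv2.getTextSize and shrink fontScale until the text fits the original box. A small sketch, where boxWidth is a hypothetical value taken from the ROI above:
import cv2

text = 'Hello World!'
font = cv2.FONT_HERSHEY_SIMPLEX
thickness = 2
boxWidth = 227  # hypothetical: width of the ROI above (330 - 103)

# shrink the font scale until the rendered string fits the box width
fontScale = 2.0
while fontScale > 0.1:
    (w, h), baseline = cv2.getTextSize(text, font, fontScale, thickness)
    if w <= boxWidth:
        break
    fontScale -= 0.1
print('chosen fontScale:', fontScale)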
Upvotes: 1