How to extract characters from images where their positions are known?

Question

I have a set of PNG images at 300 dpi. Each image is full of printed (not handwritten) text and digits.

I want to extract each character and save it as a separate image. For each character in the image, I have its position stored in a CSV file.

For instance, in image1.png, for a given character "k", I have its position:

 "k" = [left=656, right=736, top=144, down=286]

Is there a Python library that can do this? As input I have the images (PNG format) and a CSV file that contains the position of each character in each image. After executing the code, I get stuck at this line:

img_charac=img[int(coords[2]):int(coords[3]),int(coords[0]):int(coords[1])]

I got the following error:

Traceback (most recent call last):
  File "", line 1, in 
TypeError: 'NoneType' object has no attribute '__getitem__'

Soltius · Accepted Answer

So if I understood correctly, this has nothing to do with image processing: it is just file opening, image cropping and saving. Given a csv file with one character and its four coordinates per row, and an input image, the code below saves one cropped image per character.

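import csv

import cv2
import numpy as np

path_csv = ""    # folder containing your csv (placeholder, fill in)
path_image = ""  # folder containing your image (placeholder, fill in)
path_save = ""   # folder you want to save to (placeholder, fill in)

# store the coordinates of the characters from your csv in a numpy array
# (columns 1 to 4 are read as left, right, top, down)
npa = np.genfromtxt(path_csv + "cs.csv", delimiter=',', skip_header=1, usecols=(1, 2, 3, 4))
npa = np.atleast_2d(npa)  # keep a 2-D shape even if the csv has a single row
nb_charac = len(npa)  # number of characters

# store the actual letters of your csv in a list
characs = []
with open(path_csv + "cs.csv", 'rt') as f:
    reader = csv.reader(f)
    next(reader)  # skip header
    for row in reader:
        characs.append(str(row[0]))

# open your image
img = cv2.imread(path_image + "yourimagename.png")
if img is None:
    # cv2.imread returns None instead of raising when the file cannot be read,
    # which is what leads to the 'NoneType' error on the cropping line
    raise IOError("could not read the image, check path_image and the file name")

# for every line of your csv
for i in range(nb_charac):
    # get coordinates and the matching character
    coords = npa[i, :]
    charac = characs[i]

    # actual cropping of the image (easy with numpy): rows are top:down, columns are left:right
    img_charac = img[int(coords[2]):int(coords[3]), int(coords[0]):int(coords[1])]
    # saving the image
    cv2.imwrite(path_save + "carac" + str(i) + "_" + str(charac) + ".png", img_charac)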
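
Regarding the TypeError in the question: that message is what Python 2 reports when None is indexed, and cv2.imread returns None (it does not raise) when it cannot read the file, so the image path is the first thing to check. A minimal sketch of a guarded crop follows. The helper name crop_box is hypothetical, and a blank NumPy array stands in for a page that cv2.imread would normally load.

```python
import numpy as np


def crop_box(img, left, right, top, down):
    """Return the sub-image for one character box (hypothetical helper)."""
    if img is None:
        # cv2.imread signals a failed read by returning None, not by raising.
        raise ValueError("image is None: check the path passed to cv2.imread")
    height, width = img.shape[:2]
    if not (0 <= left < right <= width and 0 <= top < down <= height):
        raise ValueError("box is empty or lies outside the image")
    # NumPy indexes rows (y) first, then columns (x).
    return img[top:down, left:right]


# Stand-in for a loaded page: 300 rows by 800 columns, 3 colour channels.
page = np.zeros((300, 800, 3), dtype=np.uint8)
k_box = crop_box(page, left=656, right=736, top=144, down=286)
print(k_box.shape)  # (142, 80, 3)
```

The bounds check is optional, but it turns a silently empty or truncated crop into an explicit error.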

This is quick and dirty: the csv handling is a bit messy, for example (you could get all the info by opening the file once and converting the values), and it should be adapted to your csv file anyway.
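
The single-pass reading mentioned above could look something like this. It is only a sketch: the function name read_boxes is made up, and the column order (character, left, right, top, down, with one header row) is an assumption taken from the indexing used in the cropping code, so adjust it to your own file.

```python
import csv
import io


def read_boxes(f):
    """Read (character, left, right, top, down) tuples from an open csv file.

    Assumes one header row and the column order: character, left, right, top, down.
    """
    reader = csv.reader(f)
    next(reader)  # skip the header row
    boxes = []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        char = row[0]
        # int(float(...)) also accepts values written as "656.0"
        left, right, top, down = (int(float(v)) for v in row[1:5])
        boxes.append((char, left, right, top, down))
    return boxes


# In-memory sample standing in for cs.csv
sample = io.StringIO("char,left,right,top,down\nk,656,736,144,286\n")
boxes = read_boxes(sample)
print(boxes)  # [('k', 656, 736, 144, 286)]
```

With a real file in Python 3, you would open it with open(path_csv + "cs.csv", newline="") and pass the file object to read_boxes; each tuple then gives both the character for the file name and the four integers for the slice.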
