patrick

Reputation: 367

How to use Fred's ImageMagick textcleaner script in OpenCV C++/OpenCV Java?

I'm trying to develop an app that can read text from an image, so I have to clean up the image background first. I heard that Fred's ImageMagick textcleaner script can be used for this, but I don't know how to use it. Does anyone have any idea how?

Input Image :

[Image: input image]

Upvotes: 2

Views: 1494

Answers (2)

Mark Setchell

Reputation: 207748

I had a try at this, and while the news is not good, it's still an answer, even if a negative one. Maybe someone else wants to take my efforts further, or maybe you feel they confirm that textcleaner is not the way to go. Anyway, I took your image and wrote a script to vary the most promising parameters of Fred Weinhaus's textcleaner. I feel the ones that may help are -f (filter size), -o (offset) and -t (text-smoothing threshold), and I varied these through their likely ranges like this:

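#!/bin/bash
# Sweep filter size (-f), offset (-o) and threshold (-t), then tile all results.
for f in 1 5 10 15 20 25; do
   for o in 1 3 6 9 12; do
      for t in 1 25 50 75 100; do
         out="z_${f}_${o}_${t}.png"
         ./textcleaner -f "$f" -o "$o" -t "$t" cc.jpg "$out"
         # Label each result and stream it to montage in MIFF format
         convert -label "f=$f, o=$o, t=$t" "$out" miff:-
      done
   done
done | montage - -frame 5 -tile 6x montage.png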

That gives me this montage of all the results

[Image: montage of all results]

To my eye, the most promising was maybe f=10, o=1, t=1

[Image: textcleaner result for f=10, o=1, t=1]

I then thought "why bother seeing what I like, let's see what Tesseract likes?". So I changed the script so that Tesseract got to look at every parameter combination:

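#!/bin/bash
# Same sweep, but OCR each result and report those containing at least one digit.
for f in 1 5 10 15 20 25; do
   for o in 1 3 6 9 12; do
      for t in 1 25 50 75 100; do
         out="z_${f}_${o}_${t}.png"
         ./textcleaner -f "$f" -o "$o" -t "$t" cc.jpg "$out"
         # Tesseract writes the recognised text to res.txt
         tesseract "$out" res > /dev/null 2>&1
         # Print any OCR lines with a digit, followed by the image name
         if grep "[0-9]" res.txt; then echo "$out"; fi
      done
   done
done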

And the results were abysmal... here is the output

um 0-" V _
L"“1}- H
z_5_3_50.png
:1:J£‘u  “
z_15_3_75.png
”':{E]!)  /3: '55‘
z_15_6_75.png
 E2?
z_15_9_1.png
:1:
z_15_12_100.png
I -.352}:  "H ,1 5
z_20_12_25.png
1/
, ,5». 3».
z_25_6_75.png
 3
z_25_9_25.png
 - ::'§—:am I-:L’5‘:*‘f§~f.’i'7""“-‘-"I 5="
z_25_12_1.png
7 3:2‘
z_25_12_75.png

Nothing even remotely useful. Maybe someone else has a better idea about how to tune it and which parameters to tweak, but I suspect that textcleaner may be the wrong approach here.

Upvotes: 8

jnovacho

Reputation: 2903

Without seeing your data first, it's hard to guess. If you have a fairly uniform background, you can use adaptive thresholding to remove it.

Here is some theoretical information on how to use adaptive thresholding. The algorithm is implemented in OpenCV.
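
To make the idea concrete, here is a minimal, dependency-free sketch of mean adaptive thresholding. It is not OpenCV's implementation: the function name `adaptive_threshold_mean`, the tiny test image and the parameter values are all made up for illustration, and the border handling is simplified. In a real app you would more likely call the library routine (`cv::adaptiveThreshold` in C++, `Imgproc.adaptiveThreshold` in Java) with a block size and constant tuned to your images.

```python
def adaptive_threshold_mean(gray, block_size=3, c=2):
    """Binarize a grayscale image given as a list of rows of 0..255 values.

    A pixel becomes 255 when it is brighter than the mean of its
    block_size x block_size neighbourhood minus c, otherwise 0.
    Assumption of this sketch: neighbourhoods are clipped at the image
    border (OpenCV replicates border pixels instead).
    """
    if block_size < 3 or block_size % 2 == 0:
        raise ValueError("block_size must be odd and >= 3")
    height, width = len(gray), len(gray[0])
    radius = block_size // 2
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            # Clip the window to the image bounds
            y0, y1 = max(0, y - radius), min(height, y + radius + 1)
            x0, x1 = max(0, x - radius), min(width, x + radius + 1)
            window = [gray[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            local_mean = sum(window) / len(window)
            row.append(255 if gray[y][x] > local_mean - c else 0)
        out.append(row)
    return out


if __name__ == "__main__":
    # Hypothetical image: background brightens from left to right,
    # with two darker "text" pixels at (row 1, col 1) and (row 1, col 5).
    gray = [
        [100, 120, 140, 160, 180, 200, 220],
        [100,  60, 140, 160, 180, 150, 220],
        [100, 120, 140, 160, 180, 200, 220],
    ]
    for row in adaptive_threshold_mean(gray, block_size=3, c=15):
        print(row)
```

With this input, a single global threshold of 128 would turn the dark left-hand background black and miss the text pixel on the bright right-hand side, whereas the local mean follows the background. That is why the approach tends to suit unevenly lit scans.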

Upvotes: 0
