As I asked in my previous question, the problem I'm facing is that I have hundreds of images of handwritten notes. They were written by different people, but they are in sequence, so you know that, for example, person1 wrote img1.jpg -> img100.jpg. The style of handwriting varies a lot from person to person, but there are parts of the notes which are always fixed (maybe that can help an algorithm).
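Because the writer of each image is known from its position in the sequence, that mapping can be written down once and reused later (for example, to evaluate any recognizer per writer). A minimal sketch, assuming file names of the form imgN.jpg; only the person1 range comes from the description above, while the other ranges, the names `WRITER_STARTS`, `WRITER_NAMES`, and `writer_for` are made up for illustration:

```python
import re
from bisect import bisect_right

# First image number of each writer's block, in ascending order.
# Only "person1 starts at img1" is from the question; 101 and 201 are hypothetical.
WRITER_STARTS = [1, 101, 201]
WRITER_NAMES = ["person1", "person2", "person3"]


def image_number(filename):
    """Extract the first run of digits from a name such as 'img42.jpg'."""
    match = re.search(r"\d+", filename)
    if match is None:
        raise ValueError(f"no image number in {filename!r}")
    return int(match.group(0))


def writer_for(filename):
    """Return the writer whose block contains this image number."""
    number = image_number(filename)
    # bisect_right counts how many block starts are <= number.
    index = bisect_right(WRITER_STARTS, number) - 1
    if index < 0:
        raise ValueError(f"{filename!r} comes before the first block")
    return WRITER_NAMES[index]


if __name__ == "__main__":
    for name in ["img1.jpg", "img100.jpg", "img101.jpg"]:
        print(name, "->", writer_for(name))
```

With this in place, images can be grouped by writer before trying any recognition approach, so that results can be compared per handwriting style.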
I followed one user's suggestion to use tesseract, but it couldn't recognize any of the text. The text is not in English, but I did use the appropriate language data file.
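For reference, a tesseract run with an explicit language looks roughly like the sketch below. It assumes the `tesseract` binary (version 4 or later, which spells the flag `--psm`) is on the PATH; the language code `deu`, the page segmentation mode 6, and the helper names `build_tesseract_cmd` and `run_ocr` are placeholders for illustration, not the settings actually used:

```python
import shutil
import subprocess


def build_tesseract_cmd(image_path, lang, psm=6):
    """Build the CLI call: tesseract <image> stdout -l <lang> --psm <n>.

    'stdout' as the output base makes tesseract print the text instead of
    writing a .txt file. The language data file for `lang` must be installed.
    """
    return ["tesseract", image_path, "stdout", "-l", lang, "--psm", str(psm)]


def run_ocr(image_path, lang, psm=6):
    """Return the recognized text, or None if tesseract is not installed."""
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(
        build_tesseract_cmd(image_path, lang, psm),
        capture_output=True,
        text=True,
        check=False,  # a failed run simply yields empty output here
    )
    return result.stdout


if __name__ == "__main__":
    # 'deu' is only an example language code.
    print(" ".join(build_tesseract_cmd("img1.jpg", "deu")))
```

Trying a few different page segmentation modes is a cheap check before concluding that the stock model cannot read the handwriting at all.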
My knowledge of AI is limited, but from searching and looking at some papers it looks like this could be done with a CNN. Can someone guide me as to what I should do from here? I'd like to go forward with the project, but I also don't have a lot of time to learn about neural networks. How challenging is it to implement one that solves this task?
Upvotes: 1
Views: 4549
Reputation: 2582
I wouldn't use tesseract for handwriting recognition. You can train tesseract for handwriting recognition, but out of the box it works well for printed text in a lot of fonts and languages.
Here are two links on how to train it yourself:
I had the best results with Azure Recognition and good results with Amazon Rekognition (https://aws.amazon.com/en/recognition). I would like to have an offline Java library for it but haven't found one yet. My next step will be to try ABBYY services, because they can also focus on separated handwritten characters: https://abbyy.technology/en:features:ocr:icr
Update
If somebody finds a library or a good service, even years later, I would be happy to see it in the comments.
Upvotes: 3