Reputation: 333
I'm building an Android application with OCR and Tensorflow. It scans price tags in supermarkets and has to put the scanned data into different fields. I've done the OCR part, so the image -> text recognition works fine and Tensorflow is only required to work with text input.
I'm new to Tensorflow and machine learning in overall. Is it possible to do the following work using Tensorflow and if yes, could you share some ideas on how to do so?
The average input looks like this:
CARLSBERG
EESTI
HELE OLU 5%
1.59 +0.10
500 ml pudel
3.18 /I
4740019113419
The goal is to sort this data as follows:
Brand: CARLSBERG
Product name: HELE OLU 5%
Size: 500
Units: ml
The parameters that determine how a particular string will be classified are:
Upvotes: 3
Views: 1130
Reputation: 445
As mentioned in 4d11's answer, one of the biggest challenges in machine learning is often getting a high quality, significantly sized set of training data.
In terms of feeding data into a Tensorflow network/model, I'd recommend you check out their 'get started' tutorial on feature columns: https://www.tensorflow.org/get_started/feature_columns
Feature columns are used to represent data of different types numerically for a representation that can be fed into the model. The tutorial goes into some detail on the ways in which this works and why you may choose to represent different data in different ways. I found it pretty helpful as an intro.
Upvotes: 1
Reputation: 317
I think the first step would be to get your hands on or to generate some labelled training data. You should look into at feature extraction; for example, if you notice that for a certain item, the second line is usually the price, you could represent that as a parameter. Or say if a number is followed by a unit like ml/l/oz, it's likely to be the volume. What you want to know is how confident you are that a specific line/string is say the price.
However, I think TensorFlow would be more suited for the OCR portion of the problem, which you have already solved. What you are asking is more towards text parsing, which could be better solved with an NLP approach.
Upvotes: 1
Reputation: 1382
A Tensorflow model for text recognition (CNN + seq2seq with visual attention) available as a Python package and compatible with Google Cloud ML Engine. https://github.com/emedvedev/attention-ocr
Upvotes: 0