Reputation: 311
I have a training set of images, from which I have to predict a label made up of a fixed number of letters and digits. What is the best way to feed these labels into TensorFlow? I thought of creating a list of numbers, one for each character in the label. I created a list with all the possible values:
__dict = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u',
'v', 'w', 'x', 'y', 'z', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
Then I use the index of each character in the __dict list to encode the label into a list of numbers.
For example:
label = 'abc'
label_encoded = [0, 1, 2]
Is this the right way?
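For reference, the per-character encoding described above can be sketched roughly as follows. This is only an illustration: the names ALPHABET, CHAR_TO_INDEX, encode_label and decode_label are hypothetical, and the alphabet is assumed to be the same 36 symbols as in the list above (a to z, then 0 to 9).

```python
import string

# Assumed alphabet, in the same order as the list in the question:
# 'a'..'z' followed by '0'..'9' (36 symbols in total).
ALPHABET = string.ascii_lowercase + string.digits

# Lookup table from character to its index (hypothetical helper name).
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def encode_label(label):
    """Return one integer per character: its position in ALPHABET."""
    return [CHAR_TO_INDEX[ch] for ch in label]


def decode_label(indices):
    """Inverse of encode_label: turn a list of indices back into a string."""
    return "".join(ALPHABET[i] for i in indices)


if __name__ == "__main__":
    print(encode_label("abc"))
    print(decode_label([0, 1, 2]))
```

A dictionary lookup is used here instead of list.index so that encoding each character does not scan the whole alphabet.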
Upvotes: 0
Views: 211
Reputation: 1104
The best way is to use a unique integer for each of your labels, i.e. aaa=0, aab=1, etc.
A convenient way to get these integers is a positional encoding in base 36 (assuming the characters are lowercase English letters plus digits).
I don't know the problem you are facing, but be aware that this can lead to a huge number of labels (i.e. output classes for your classification problem): with 36 possible characters, a label of n characters gives 36^n classes, e.g. 46,656 for three characters.
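A minimal sketch of the base-36 positional encoding mentioned above, assuming the alphabet order from the question (a to z, then 0 to 9) so that aaa maps to 0 and aab to 1. The function names label_to_class and class_to_label are hypothetical, chosen only for this example.

```python
import string

# Assumed digit order: 'a' is digit 0, ..., 'z' is 25, '0' is 26, ..., '9' is 35.
ALPHABET = string.ascii_lowercase + string.digits
BASE = len(ALPHABET)  # 36


def label_to_class(label):
    """Read the label as a base-36 number, most significant character first."""
    class_id = 0
    for ch in label:
        class_id = class_id * BASE + ALPHABET.index(ch)
    return class_id


def class_to_label(class_id, length):
    """Inverse mapping: recover a fixed-length label from its class id."""
    chars = []
    for _ in range(length):
        class_id, digit = divmod(class_id, BASE)
        chars.append(ALPHABET[digit])
    # Digits were extracted least significant first, so reverse them.
    return "".join(reversed(chars))


if __name__ == "__main__":
    print(label_to_class("aab"))
    print(class_to_label(1, 3))
```

Note that Python's built-in int(s, 36) orders the digits 0 to 9 before a to z, so it would not give aaa=0 with this alphabet; that is why the conversion is written out by hand here.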
Upvotes: 1