Reputation: 111
I am new to ML and thinking about one simple hello world problem as a practice of using ANN. Here is the problem, say I have a training data set with English name and its corresponding gender:
ALEX,M
BONNIE,F
CARLO,F
DAVID,M
EDDY,M
...
I would like to build a model to predict the gender of name. Since the input to ANN must be a form of vector, I am thinking to convert a name into a vector with a number of features same as the longest name in the data set (i.e. 10) and then put A=1, B=2, ... , Z=26, and null=-1 to the vector.
For example:
ALEX will be [1, 12, 5, 24, -1, -1, -1, -1, -1, -1]
The output layer will be {0, 1} which represent either male or female.
It sounds quite strange to me. Is it a good way to feed a single word into ANN like this?
Upvotes: 2
Views: 1340
Reputation: 6759
Use one-hot encoding. This means you have a large input size, but it has proven to work. Using A=1, B=2, Z=26
gives the network the impression that B is closer to A than Z is to A, and it requires a lot of hidden nodes to map the function.
With one-hot encoding:
[1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
[0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1]
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
So it requires 26 inputs per letter.
Upvotes: 1