NeoNosliw

Reputation: 111

How to convert a single word into a vector for neural network input

I am new to ML and am thinking about a simple "hello world" problem to practice using an ANN. Here is the problem: say I have a training data set of English names and their corresponding genders:

ALEX,M
BONNIE,F
CARLO,F
DAVID,M
EDDY,M
...

I would like to build a model that predicts the gender of a name. Since the input to an ANN must be a vector, I am thinking of converting each name into a vector with as many features as the longest name in the data set has letters (i.e. 10), mapping A=1, B=2, ..., Z=26 and filling the unused positions with -1.

For example:

ALEX will be [1, 12, 5, 24, -1, -1, -1, -1, -1, -1] 
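A minimal sketch of this encoding, assuming names contain only the letters A to Z and that the longest name has 10 letters. The function name `encode_ordinal` and the constants are hypothetical, chosen only for illustration:

```python
MAX_LEN = 10  # assumed length of the longest name in the data set
PAD = -1      # value used for positions past the end of the name

def encode_ordinal(name, max_len=MAX_LEN):
    """Map A..Z to 1..26 and pad with -1 up to max_len."""
    codes = [ord(ch) - ord("A") + 1 for ch in name.upper()[:max_len]]
    return codes + [PAD] * (max_len - len(codes))

print(encode_ordinal("ALEX"))  # [1, 12, 5, 24, -1, -1, -1, -1, -1, -1]
```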

The output layer will produce a value in {0, 1}, representing either male or female.

This approach seems quite strange to me. Is it a good way to feed a single word into an ANN?

Upvotes: 2

Views: 1340

Answers (1)

Thomas Wagenaar

Reputation: 6759

Use one-hot encoding. The input size becomes large, but the approach has proven to work. Encoding A=1, B=2, ..., Z=26 imposes an ordering on the letters: it gives the network the impression that B is closer to A than Z is to A, and the network then needs a lot of hidden nodes to map the function.

With one-hot encoding:

  • A: [1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
  • B: [0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
  • Z: [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1]
  • none: [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]

So it requires 26 inputs per letter. With names padded to 10 letters, as in your example, that is 10 × 26 = 260 inputs in total.
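As a rough sketch of how a whole name could be turned into one input vector this way, assuming names contain only A to Z and are padded to 10 letters as in the question. The helper name `one_hot_name` is hypothetical, and any character outside A to Z is simply left as an all-zero slot:

```python
import string

ALPHABET = string.ascii_uppercase  # the 26 letters A..Z
MAX_LEN = 10  # assumed longest name length, taken from the question

def one_hot_name(name, max_len=MAX_LEN):
    """Return a flat list of max_len * 26 zeros and ones."""
    vector = []
    for i in range(max_len):
        slot = [0] * len(ALPHABET)  # an all-zero slot means "none"
        if i < len(name):
            idx = ALPHABET.find(name[i].upper())
            if idx >= 0:  # unknown characters stay all-zero
                slot[idx] = 1
        vector.extend(slot)
    return vector

x = one_hot_name("ALEX")
print(len(x))  # 260 inputs for the whole name
print(sum(x))  # 4, one active input per letter
```

The 260 values can then be fed to the input layer directly; positions past the end of the name stay all zeros, matching the "none" row above.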

Upvotes: 1
