GIS87

Reputation: 111

Predicting values that are not the same shape as the training data that the model fit to

I'm trying to train a deep neural network to classify a string based on its value, so my data is all text. However, it's not text in the sense of a sentence, which is what most text classification threads I've seen on the Internet talk about. For the algorithm to work, I one-hot encoded the inputs (these are not categorical values, so I'm not sure whether there's a more correct way to encode them) and trained the model. The problem is that when I try to run a new text string that the algorithm has not seen in the test or training dataset, the algorithm expects the input to have the shape of the one-hot encoded training dataset. How are we supposed to train a model and then change the inputs so that it will accept an actual string that is not necessarily the same shape as the data the model was fit to?

Here is an example of the training data:

SB-01_0-1_20200701    1
11-22-4334            0
MW-01_20200621        1
Benzene               0

To illustrate the issue, here is the code of the model itself:

DNNmodel = keras.Sequential([
    keras.layers.Dense(1),  #input layer size
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(1)   #output layer size
])

DNNmodel.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

DNNmodel.fit(x_train, y_train, epochs=3, batch_size=32)

And when I try to run:

DNNmodel.predict(np.array(["RI-SB-01_0-5_20200102"]))

to try to classify a single string value, I get the value error of "ValueError: Input 0 of layer sequential_21 is incompatible with the layer: expected axis -1 of input shape to have value 10509 but received input with shape [None, 1]"

Any tips on how this is done?

Upvotes: 0

Views: 672

Answers (1)

Gaslight Deceive Subvert

Reputation: 20438

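You cannot do that. Every input you pass to the network must have the same feature dimension as the data it was fit on (here, the 10509 one-hot columns named in the error message), and the same goes for the outputs.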

To get around that limitation, you should encode your strings to fixed-size vectors. Like this, if you want 20-dimensional vectors:

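import numpy as np

MAX_LEN = 20
X = ['SB-01_0-1_20200701', '11-22-4334', 'MW-01_20200621', 'Benzene']
X = [[ord(c) for c in x[:MAX_LEN]] for x in X]   # code points, truncated to MAX_LEN
X = [x + [0] * (MAX_LEN - len(x)) for x in X]    # zero-pad shorter strings
X = np.array(X)                                  # shape (4, 20)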

Your network should be changed accordingly:

DNNmodel = keras.Sequential([
    keras.Input(shape=(20,)),                    # input size: one value per character slot
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(1, activation='sigmoid')  # one probability, as binary_crossentropy expects
])

Then when predicting you must encode the input in the same way you encoded the training data.
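One way to keep the two encodings in sync is to put the encoding in a single helper and call it for both training and prediction. The following is only a sketch: `encode_strings` and `MAX_LEN` are names made up for illustration, and truncating strings longer than 20 characters is an assumption (your example "RI-SB-01_0-5_20200102" is 21 characters long, so padding alone would not give it the right length).

```python
MAX_LEN = 20  # assumed width; must match the input size the model was built with


def encode_strings(strings, max_len=MAX_LEN):
    """Turn each string into a list of exactly max_len integer code points."""
    encoded = []
    for s in strings:
        codes = [ord(c) for c in s[:max_len]]   # truncate strings that are too long
        codes += [0] * (max_len - len(codes))   # zero-pad strings that are too short
        encoded.append(codes)
    return encoded


# Same helper for the training data and for any new string.
train_rows = encode_strings(
    ['SB-01_0-1_20200701', '11-22-4334', 'MW-01_20200621', 'Benzene']
)
new_rows = encode_strings(['RI-SB-01_0-5_20200102'])
```

Wrapping the result in np.array(...) gives an array of shape (n, 20), so np.array(new_rows) can be passed to DNNmodel.predict just like the training array was passed to fit.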

Upvotes: 1
