Reputation: 4442
I'm trying to classify text data, where df['Addr'] is X and df['Reg'] is y:
                                               Addr  Reg
640022, РОССИЯ, КУРГАНСКАЯ ОБЛ, Г КУРГАН, УЛ ГО...   45
624214, РОССИЯ, СВЕРДЛОВСКАЯ ОБЛ, Г ЛЕСНОЙ, РП ...   66
454018, РОССИЯ, ЧЕЛЯБИНСКАЯ ОБЛ, Г ЧЕЛЯБИНСК, У...   74
624022, РОССИЯ, СВЕРДЛОВСКАЯ ОБЛ, СЫСЕРТСКИЙ Р-...   66
454047, РОССИЯ, ЧЕЛЯБИНСКАЯ ОБЛ, Г ЧЕЛЯБИНСК, У...   74
456787, РОССИЯ, ЧЕЛЯБИНСКАЯ ОБЛ, Г ОЗЕРСК, УЛ Г...   74
450075, РОССИЯ, БАШКОРТОСТАН РЕСП, Г УФА, ПР-КТ...    3
623854, РОССИЯ, СВЕРДЛОВСКАЯ ОБЛ, Г ИРБИТ, УЛ С...   66
457101, РОССИЯ, ЧЕЛЯБИНСКАЯ ОБЛ, Г ТРОИЦК, УЛ С...   74
640008, РОССИЯ, КУРГАНСКАЯ ОБЛ, Г КУРГАН, ПР-КТ...   45
I'm trying to use a single-layer TensorFlow network to classify the addresses, but it returns all 0s instead of the relevant regions.
Here is my code:
import numpy as np
import tensorflow as tf
from scipy.sparse import csr_matrix
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split

# bag-of-words features from the address strings
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(df['Addr'])
X = csr_matrix(X).todense()
X_train, X_test, y_train, y_test = train_test_split(
    X, df['Reg'].values.reshape(-1, 1), shuffle=True, test_size=0.2)

# tf
def reset_graph(seed=42):
    tf.reset_default_graph()
    tf.set_random_seed(seed)
    np.random.seed(seed)

def random_batch(X_train, y_train, batch_size):
    rnd_indices = np.random.randint(0, X_train.shape[0], batch_size)
    X_batch = X_train[rnd_indices]
    y_batch = y_train[rnd_indices]
    return X_batch, y_batch

reset_graph()

X = tf.placeholder(tf.float32, shape=(None, X_train.shape[1]), name="input")
y = tf.placeholder(tf.float32, shape=(None, y_train.shape[1]), name="y")
y_cls = tf.argmax(y, axis=1)

# single layer: ReLU(X @ weights + bias)
weights = tf.Variable(tf.truncated_normal([X_train.shape[1], y_train.shape[1]], stddev=0.05),
                      name="weights", trainable=True)
bias = tf.constant(1.0, shape=[y_train.shape[1]], name="bias")
layer_1 = tf.nn.relu_layer(X, weights, bias, name="relu_layer")
outs = tf.nn.softmax(layer_1, name="outs")
y_pred = tf.argmax(outs, axis=1)

cross_entropy = tf.nn.softmax_cross_entropy_with_logits_v2(logits=layer_1, labels=y)
cost = tf.reduce_mean(cross_entropy)

acc = tf.cast(tf.equal(y_pred, y_cls), tf.float16)
predicted = tf.reduce_sum(acc)

learning_rate = 0.01
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(cost)
init = tf.global_variables_initializer()

n_epochs = 100
batch_size = 500
n_batches = int(np.ceil(1000 / batch_size))

with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        for batch_index in range(n_batches):
            X_batch, y_batch = random_batch(X_train, y_train, batch_size)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
        loss_val = cost.eval({X: X_test, y: y_test})
        if epoch % 10 == 0:
            print("Epoch:", epoch, "\tLoss:", loss_val)
    y_proba_val = y_pred.eval(feed_dict={X: X_test, y: y_test})
    print(y_test.reshape(1, -1))
    print(y_proba_val.reshape(1, -1))
Result of this code:
Epoch: 0 Loss: 0.0
Epoch: 10 Loss: 0.0
Epoch: 20 Loss: 0.0
Epoch: 30 Loss: 0.0
...
Epoch: 90 Loss: 0.0
[[ 3 66 66 ... 66 66 66]]
[[0 0 0 ... 0 0 0]]
I can't find the error in my program. I've read that softmax is usually used in classification tasks, but I'm not confident in what I'm doing. Why does it return all-0 predictions?
Upvotes: 1
Views: 91
Reputation: 134
I'm pretty sure that your network currently looks like this (excuse my paint skills):
[diagram: the input units wired directly to the output units, with no hidden layer]
If you're not going to come up with features for the different addresses on your own, I suggest you add at least one hidden layer so that the network can attempt to create its own features. Currently there's only one weight per connection to tweak, and that's going to result in a VERY weak classifier.
I believe that's the root of the problem, but I'm not entirely sure why your loss is always 0.0. I'll keep looking; in the meantime, this is some food for thought.
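Roughly, a hidden layer could be wedged into your existing graph like this (just a sketch, not tested; the hidden width of 128 is an arbitrary choice, and note the biases are Variables here so they can actually be trained, unlike your constant bias):

n_hidden = 128  # width of the hidden layer -- arbitrary, tune it

# hidden layer: lets the network build its own intermediate features
W1 = tf.Variable(tf.truncated_normal([X_train.shape[1], n_hidden], stddev=0.05), name="W1")
b1 = tf.Variable(tf.zeros([n_hidden]), name="b1")
hidden = tf.nn.relu(tf.matmul(X, W1) + b1, name="hidden")

# output layer: one raw score per output unit
W2 = tf.Variable(tf.truncated_normal([n_hidden, y_train.shape[1]], stddev=0.05), name="W2")
b2 = tf.Variable(tf.zeros([y_train.shape[1]]), name="b2")
layer_out = tf.add(tf.matmul(hidden, W2), b2, name="layer_out")

outs = tf.nn.softmax(layer_out, name="outs")
y_pred = tf.argmax(outs, axis=1)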
EDIT: The logits argument is supposed to represent the predicted output of the network (a distribution of probabilities), so I'd set it to outs:
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=outs, labels=y)
Upvotes: 1