Reputation: 4442
I'm trying to classify text data, where df['Addr'] is X and df['Reg'] is y:
                                               Addr  Reg
640022, РОССИЯ, КУРГАНСКАЯ ОБЛ, Г КУРГАН, УЛ ГО...   45
624214, РОССИЯ, СВЕРДЛОВСКАЯ ОБЛ, Г ЛЕСНОЙ, РП ...   66
454018, РОССИЯ, ЧЕЛЯБИНСКАЯ ОБЛ, Г ЧЕЛЯБИНСК, У...   74
624022, РОССИЯ, СВЕРДЛОВСКАЯ ОБЛ, СЫСЕРТСКИЙ Р-...   66
454047, РОССИЯ, ЧЕЛЯБИНСКАЯ ОБЛ, Г ЧЕЛЯБИНСК, У...   74
456787, РОССИЯ, ЧЕЛЯБИНСКАЯ ОБЛ, Г ОЗЕРСК, УЛ Г...   74
450075, РОССИЯ, БАШКОРТОСТАН РЕСП, Г УФА, ПР-КТ...    3
623854, РОССИЯ, СВЕРДЛОВСКАЯ ОБЛ, Г ИРБИТ, УЛ С...   66
457101, РОССИЯ, ЧЕЛЯБИНСКАЯ ОБЛ, Г ТРОИЦК, УЛ С...   74
640008, РОССИЯ, КУРГАНСКАЯ ОБЛ, Г КУРГАН, ПР-КТ...   45
I'm trying to use a single-layer TensorFlow network to classify the addresses, but it returns all 0s instead of the relevant regions.
Here is my code:
import numpy as np
import tensorflow as tf
from scipy.sparse import csr_matrix
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split

# bag-of-words features from the address strings
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(df['Addr'])
X = csr_matrix(X).todense()
X_train, X_test, y_train, y_test = train_test_split(
    X, df['Reg'].values.reshape(-1, 1), shuffle=True, test_size=0.2)

# tf
def reset_graph(seed=42):
    tf.reset_default_graph()
    tf.set_random_seed(seed)
    np.random.seed(seed)

def random_batch(X_train, y_train, batch_size):
    rnd_indices = np.random.randint(0, X_train.shape[0], batch_size)
    X_batch = X_train[rnd_indices]
    y_batch = y_train[rnd_indices]
    return X_batch, y_batch

reset_graph()

X = tf.placeholder(tf.float32, shape=(None, X_train.shape[1]), name="input")
y = tf.placeholder(tf.float32, shape=(None, y_train.shape[1]), name="y")
y_cls = tf.argmax(y, axis=1)

# single layer: ReLU(X @ weights + bias)
weights = tf.Variable(tf.truncated_normal([X_train.shape[1], y_train.shape[1]], stddev=0.05),
                      name="weights", trainable=True)
bias = tf.constant(1.0, shape=[y_train.shape[1]], name="bias")
layer_1 = tf.nn.relu_layer(X, weights, bias, name="relu_layer")
outs = tf.nn.softmax(layer_1, name="outs")
y_pred = tf.argmax(outs, axis=1)

cross_entropy = tf.nn.softmax_cross_entropy_with_logits_v2(logits=layer_1, labels=y)
cost = tf.reduce_mean(cross_entropy)

acc = tf.cast(tf.equal(y_pred, y_cls), tf.float16)
predicted = tf.reduce_sum(acc)

learning_rate = 0.01
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(cost)
init = tf.global_variables_initializer()

n_epochs = 100
batch_size = 500
n_batches = int(np.ceil(1000 / batch_size))

with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        for batch_index in range(n_batches):
            X_batch, y_batch = random_batch(X_train, y_train, batch_size)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
        loss_val = cost.eval({X: X_test, y: y_test})
        if epoch % 10 == 0:
            print("Epoch:", epoch, "\tLoss:", loss_val)
    y_proba_val = y_pred.eval(feed_dict={X: X_test, y: y_test})
    print(y_test.reshape(1, -1))
    print(y_proba_val.reshape(1, -1))
Result of this code:
Epoch: 0 Loss: 0.0
Epoch: 10 Loss: 0.0
Epoch: 20 Loss: 0.0
Epoch: 30 Loss: 0.0
...
Epoch: 90 Loss: 0.0
[[ 3 66 66 ... 66 66 66]]
[[0 0 0 ... 0 0 0]]
I can't find the error in my program. I've read that softmax is usually used in classification tasks, but I'm not confident in what I'm doing. Why does it return all-0 predictions?
Upvotes: 1
Views: 91
Reputation: 134
I'm pretty sure that your network currently looks like this (excuse my paint skills):
[diagram: the input units wired directly to the output units, with no hidden layer]
If you're not going to come up with features for the different addresses on your own, I suggest you add at least one hidden layer so that the network can attempt to create its own features. Currently there's only one weight per connection to tweak, and that's going to result in a VERY weak classifier.
I believe that's the root of the problem, but I'm not entirely sure why your loss is always 0.0. I'll keep looking; in the meantime, this is some food for thought.
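Roughly, a hidden layer could be wedged into your existing graph like this (just a sketch, not tested; the hidden width of 128 is an arbitrary choice, and note the biases are Variables here so they can actually be trained, unlike your constant bias):

n_hidden = 128  # width of the hidden layer -- arbitrary, tune it

# hidden layer: lets the network build its own intermediate features
W1 = tf.Variable(tf.truncated_normal([X_train.shape[1], n_hidden], stddev=0.05), name="W1")
b1 = tf.Variable(tf.zeros([n_hidden]), name="b1")
hidden = tf.nn.relu(tf.matmul(X, W1) + b1, name="hidden")

# output layer: one raw score per output unit
W2 = tf.Variable(tf.truncated_normal([n_hidden, y_train.shape[1]], stddev=0.05), name="W2")
b2 = tf.Variable(tf.zeros([y_train.shape[1]]), name="b2")
layer_out = tf.add(tf.matmul(hidden, W2), b2, name="layer_out")

outs = tf.nn.softmax(layer_out, name="outs")
y_pred = tf.argmax(outs, axis=1)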
EDIT: The logits argument is supposed to represent the predicted output of the network (a distribution of probabilities), so I'd set it to outs:
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=outs, labels=y)
Upvotes: 1