DocDriven

Reputation: 3974

Meaning and dimensions of tf.contrib.learn.DNNClassifier's extracted weights and biases

I am relatively new to TensorFlow, but even after a lot of research I was unable to find any documentation of what certain variables mean.

For my current project, I want to train a DNN with TensorFlow and afterwards extract the weight and bias matrices from it for use in another application OUTSIDE TensorFlow. As a first try, I set up a simple network with a [4, 10, 2] structure, which predicts a binary outcome.

I used 3 real_valued_columns and a single sparse_column_with_keys (wrapped in an embedding_column) as features:

import tensorflow as tf

def build_estimator(optimizer=None, activation_fn=tf.sigmoid):
    """Build an estimator."""
    # Sparse base columns
    column_stay_point = tf.contrib.layers.sparse_column_with_keys(
        column_name='stay_point',
        keys=['no', 'yes'])

    # Continuous base columns
    column_heading = tf.contrib.layers.real_valued_column('heading')
    column_velocity = tf.contrib.layers.real_valued_column('velocity')
    column_acceleration = tf.contrib.layers.real_valued_column('acceleration')

    pedestrian_feature_columns = [column_heading, 
                                  column_velocity, 
                                  column_acceleration,
                                  tf.contrib.layers.embedding_column(
                                      column_stay_point, 
                                      dimension=8, 
                                      initializer=tf.truncated_normal_initializer)]

    # Create classifier
    estimator = tf.contrib.learn.DNNClassifier(
        hidden_units=[10],
        feature_columns=pedestrian_feature_columns,
        model_dir='./tmp/pedestrian_model',
        n_classes=2,
        optimizer=optimizer,
        activation_fn=activation_fn)

    return estimator
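
Roughly, I train it like this (the input_fn below is a simplified stand-in for my real data pipeline, just to show the shape of the call):

def input_fn():
    # Two toy samples; real data comes from my dataset
    features = {
        'heading': tf.constant([[0.1], [0.5]]),
        'velocity': tf.constant([[1.2], [0.3]]),
        'acceleration': tf.constant([[0.0], [0.7]]),
        # sparse_column_with_keys expects a SparseTensor of string keys
        'stay_point': tf.SparseTensor(indices=[[0, 0], [1, 0]],
                                      values=['no', 'yes'],
                                      dense_shape=[2, 1]),
    }
    labels = tf.constant([[0], [1]])
    return features, labels

estimator = build_estimator()
estimator.fit(input_fn=input_fn, steps=100)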

I called this function with default arguments and used estimator.fit(...) to train the DNN. Aside from some warnings about the deprecated 'scalar_summary' function, it ran successfully and produced reasonable results. I printed all of the model's variables using the following line:

var = {k: estimator.get_variable_value(k) for k in estimator.get_variable_names()}
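
To inspect what actually comes back, I listed the shapes with a small sketch like this (assuming var is the dict from the line above):

for name, value in sorted(var.items()):
    # value is a NumPy array for matrices, and a plain scalar for e.g. global_step
    print(name, getattr(value, 'shape', value))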

I expected to get weight matrices of size 10x4 and 2x10 as well as bias matrices of size 10x1 and 2x1, but I got the following:

'dnn/binary_logistic_head/dnn/learning_rate': 0.05 (actual value, scalar)
'dnn/input_from_feature_columns/stay_point_embedding/weights': 2x8 array
'dnn/hiddenlayer_0/weights/hiddenlayer_0/weights/part_0/Adagrad': 11x10 array
'dnn/input_from_feature_columns/stay_point_embedding/weights/int_embedding/weights/part_0/Adagrad': 2x8 array
'dnn/hiddenlayer_0/weights': 11x10 array
'dnn/logits/biases': 1x1 array
'dnn/logits/weights/nn/dnn/logits/weights/part_0/Adagrad': 10x1 array
'dnn/logits/weights': 10x1 array
'dnn/logits/biases/dnn/dnn/logits/biases/part_0/Adagrad': 1x1 array
'global_step': 5800 (actual value, scalar)
'dnn/hiddenlayer_0/biases': 1x10 array
'dnn/hiddenlayer_0/biases//hiddenlayer_0/biases/part_0/Adagrad': 1x10 array

Is there any documentation explaining what these cryptic names mean, and why do the matrices have these unexpected dimensions? Also, why are there references to the Adagrad optimizer even though I never specified one?

Any help is highly appreciated!

Upvotes: 1

Views: 450

Answers (1)

Manavender

Reputation: 26

The number of input nodes in your network is 11, not 4: 8 (embedding_column) + 1 (column_heading) + 1 (column_velocity) + 1 (column_acceleration) = 11.

And based on the variable names, the output is a single binary logistic node, so there is only one output node, not 2.

Below are the weights/biases you are interested in.

'dnn/hiddenlayer_0/weights': 11x10 array --> these are the weights from the inputs to the hidden nodes
'dnn/hiddenlayer_0/biases': 1x10 array --> biases of the hidden nodes
'dnn/logits/weights': 10x1 array --> weights from the hidden nodes to the output node
'dnn/logits/biases': 1x1 array --> bias of the output node
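
Since you want to use the weights outside TensorFlow, here is a rough NumPy sketch of the forward pass those four arrays define. It assumes the sigmoid activation from your build_estimator; the row order of the embedding table ('no' vs. 'yes') and the order in which the 11 input values are concatenated are assumptions you should verify against your feature columns:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

W0 = var['dnn/hiddenlayer_0/weights']   # 11x10, inputs -> hidden
b0 = var['dnn/hiddenlayer_0/biases']    # 1x10
W1 = var['dnn/logits/weights']          # 10x1, hidden -> output
b1 = var['dnn/logits/biases']           # 1x1
E = var['dnn/input_from_feature_columns/stay_point_embedding/weights']  # 2x8

def predict_proba(continuous, stay_point_row):
    # continuous: length-3 array (heading, velocity, acceleration)
    # stay_point_row: 0 or 1, selecting a row of the 2x8 embedding table
    x = np.concatenate([continuous, E[stay_point_row]])[None, :]  # 1x11 (order assumed)
    hidden = sigmoid(x.dot(W0) + b0)  # 1x10
    logit = hidden.dot(W1) + b1       # 1x1
    return sigmoid(logit)             # probability of the positive class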

"Why are there references to the Adagrad optimizer despite never specifying it?"

The default optimizer of DNNClassifier is Adagrad, so the variables whose names end in /Adagrad are that optimizer's per-variable accumulator slots, not additional network weights.
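
If you want a different one, you can pass an optimizer explicitly, e.g. a sketch with plain gradient descent against the same tf 1.x API:

import tensorflow as tf

# Replace the default Adagrad with plain SGD (learning rate chosen arbitrarily)
estimator = build_estimator(
    optimizer=tf.train.GradientDescentOptimizer(learning_rate=0.05))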

Upvotes: 1
