natedogg

Reputation: 95

Efficiently compute arbitrary neural network output?

I'm learning about neural networks and they are some of the neatest things I've come across.

My question is: how do you compute the output of a neural network with arbitrary topology? Is there some algorithm or rule of thumb to use?

For example, I understand that feed-forward networks have straightforward matrix representations, but what about networks with loops or with outputs connected to inputs? Is there a matrix form for those? Or is the only way to produce output to do some kind of graph-traversal?

Example:

(figure: an example network with numbered input, hidden, and output nodes)

Upvotes: 3

Views: 456

Answers (1)

S. Stas

Reputation: 810

  1. Let's look at the picture of your neural network structure attached to the question.
    The connections of an artificial neural network do not form an arbitrary directed graph, as it might seem. There are additional restrictions, such as different types of nodes arranged in layers.
    There are input nodes, hidden nodes, and output nodes. Put simply, input nodes (neuron values) are read-only; nothing in the network can modify them. That's why the connection between nodes 9 and 4 is meaningless, and so is input 4 itself, because its signal never propagates further.
    The same holds for the connection between nodes 8 and 11. You may look here; the basics of neural networks are explained in a simple way.

  2. Talking about networks with loops, we mean recurrent networks. Assume we have the recurrent neural network shown in the picture below. (figure: a recurrent neural network) How would the output be calculated?
    We can try to apply the same calculation rule as for a feed-forward network:
    h(t) = f(W·x(t) + U·h(t−1)), where f is the activation function.
    But wait a second: don't we need to know the h(t−1) values first? Technically, this is not a recursion.
    You can read it as "the next value of the hidden node depends on the current value of the hidden node".
    The dynamics of the depicted network can be visualized by "unfolding" it over time. (figure: the recurrent network unrolled, one copy of the hidden layer per time step) So a recurrent network can be treated as a deep network with a single layer per time step and weights shared across time steps. Here, the step-0 hidden layer serves as input for step 1.
    Back to our case, the first step is computed as h(1) = f(W·x(1) + U·h(0)).
    Similarly, the second step is h(2) = f(W·x(2) + U·h(1)).

To simplify, the initial values h(0) can be set to zeros (though in practice the initial state is often trained as a model parameter). Unfolding is also what is used for training: technically, we simply replace the recurrent network with a series of feed-forward hidden layers.
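The unrolled computation can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: a single recurrent layer with input weights W, recurrent weights U, tanh as the activation f, and the initial hidden state set to zeros; all names and shapes are made up for the example.

```python
import numpy as np

def rnn_forward(xs, W, U, f=np.tanh):
    """Unrolled forward pass of a simple recurrent layer.

    xs : sequence of input vectors x(1), x(2), ...
    W  : input-to-hidden weight matrix
    U  : hidden-to-hidden (recurrent) weight matrix
    """
    h = np.zeros(U.shape[0])      # h(0) initialized with zeros
    hs = []
    for x in xs:                  # one "feed-forward layer" per time step
        h = f(W @ x + U @ h)      # h(t) = f(W·x(t) + U·h(t-1))
        hs.append(h)
    return hs

# Usage: 3 time steps, 2 inputs, 4 hidden units (random illustrative weights)
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 2))
U = rng.normal(size=(4, 4))
xs = [rng.normal(size=2) for _ in range(3)]
hs = rnn_forward(xs, W, U)
print(len(hs), hs[0].shape)       # prints: 3 (4,)
```

Note that the loop body is exactly one feed-forward layer, applied once per time step with the same W and U, which is the unfolding described above.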

So, to conclude: recurrent networks still have matrix representations and matrix operations, even though that is less obvious and straightforward than in the feed-forward case.

Upvotes: 3
