Ben

Reputation: 437

Neural Networks - Checking for node activation

I'm involved in a research project that is looking at using Neural Networks in a safety-critical environment. Part of the regulatory framework this research targets states that there must be no dormant code within the system: there must be a pathway through every part of the system, and that pathway must be testable/verifiable.

Obviously the neural network is composed of many nodes. The input/output nodes are easy to test for activation, but does anyone know of a method for testing activation of hidden-layer nodes?

Obviously the activation is dependent on the node's input values and activation function, so there may be a mathematical approach to this.

Ultimately the code will be in C/C++, but we're doing the NN development in Python. So any ideas involving related toolsets would be gratefully received. I could also export/import the NN structure and matrices to another package or environment if that helps with this testing.

Hopefully you'll all be overflowing with ideas, because Google didn't offer anything. :(

Thanks.

Upvotes: 2

Views: 345

Answers (2)

I look at the gradients to see whether nodes were activated, although this approach needs modification depending on the activation function. I only look at the gradient of each node's bias, because all the information we need is contained there; you could use the weights instead, but the biases are simpler.

For example (PyTorch):

# Fraction of nodes per layer whose bias gradient is non-zero,
# i.e. that activated on the last backward pass (ReLU assumed)
current_layer = []
for name, params in neural_network.named_parameters():
    if "bias" in name:
        current_layer.append(params.grad.count_nonzero().item() / len(params))

(Sorry if the formatting is off, submitting this on a phone)

In this code, I'm getting the fraction of nodes activated per layer. I count non-zero gradients because my activation function is ReLU: a node's weights and bias receive zero gradient if the node did not activate. For Leaky ReLU or other activation functions you would swap in whatever condition follows from the activation function's derivative.
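Not part of the original gradient-based approach, but an alternative sketch: PyTorch's forward hooks (module.register_forward_hook) let you record each activation layer's output directly, so the check doesn't depend on gradients or a backward pass at all. The toy model below is a placeholder, not the asker's network:

import torch
import torch.nn as nn

# Placeholder model standing in for the real network
neural_network = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 3))

activation_fractions = {}

def make_hook(layer_name):
    def hook(module, inputs, output):
        # For ReLU, a node counts as activated iff its output is > 0;
        # average over the batch and the nodes in the layer
        activation_fractions[layer_name] = (output > 0).float().mean().item()
    return hook

for name, module in neural_network.named_modules():
    if isinstance(module, nn.ReLU):
        module.register_forward_hook(make_hook(name))

_ = neural_network(torch.randn(32, 4))  # one forward pass over a test batch
print(activation_fractions)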

If you wanted to identify which nodes never activate, you could adjust the gradient-based code above to store the indices of every node and remove an index whenever that node activates. Whatever indices remain after running your whole test set belong to nodes that never activated.
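A minimal sketch of that idea, assuming neural_network is the model from the snippet above and that loss.backward() has already been called for the current test batch (the second loop would normally sit inside your evaluation loop):

# Start with every node index marked as never activated
never_activated = {
    name: set(range(len(params)))
    for name, params in neural_network.named_parameters()
    if "bias" in name
}

# After each backward pass, strike off nodes whose bias gradient is non-zero
for name, params in neural_network.named_parameters():
    if "bias" in name and params.grad is not None:
        active_indices = params.grad.nonzero(as_tuple=True)[0].tolist()
        never_activated[name] -= set(active_indices)

# Once the whole test set has been processed, anything left never activated
print({name: sorted(indices) for name, indices in never_activated.items()})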

Upvotes: 0

Ben

Reputation: 437

Somehow techytushar's comment nudged my brain into a new line of reasoning, which I think has been very helpful:

So the problem I'm addressing is: 'There can be no dormant code', whether that means lines of C or array elements that are never, and can never, be accessed.

So when the trained NN runs as a compiled C application, the application will calculate the value of each neuron and evaluate its activation function irrespective of the node's input value(s). So there is actually no such thing as dormant code or dormant array elements in this regard, just a true/false output for that node's activation at that moment. It might change in the next moment; it will all be recalculated, even if mathematically the result is always no activation.
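To illustrate the point, here is a toy sketch in Python/NumPy for consistency with the rest of the thread, though the same holds for the compiled C; W, b and x are placeholders, not anything from the real system:

import numpy as np

def dense_relu(x, W, b):
    # Every element of W and b is read on every call, and the activation
    # is evaluated for every node unconditionally, even when it returns 0
    z = W @ x + b
    return np.maximum(z, 0.0)

x = np.random.randn(4)     # placeholder input
W = np.random.randn(8, 4)  # placeholder weights
b = np.random.randn(8)     # placeholder biases
h = dense_relu(x, W, b)    # no branch ever skips a node or array element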

So the question then moves away from this subject, to ensuring that no combination of node activations can result in the system being in a dangerous state. That's off topic for the original question, so I think I can draw a line under this...?

Upvotes: 1
