padmarajravi

Reputation: 41

How to use the Spark Mlib Multilayer Perceptron Weights Array

I have a requirement where I need to find the relative importance of the attributes used in an ANN implementation. I use the Spark MLlib MultilayerPerceptron for the implementation. The model gives me a vector that is an array of the weights. I know there are algorithms to derive the relative importance from weights, but the MLlib implementation returns one big single-dimensional array and does not say anything about which weights correspond to which input. Does anyone know how to get the weights corresponding to each input node?

Upvotes: 4

Views: 1014

Answers (1)

Jonathan H.

Reputation: 56

The model flattens the weight matrices using Breeze's toDenseVector (notice the line: val brzWeights: BV[Double] = weightsOld.asBreeze.toDenseVector).

This operation acts like numpy's flatten(). Therefore, to recover the weight matrices, you have to do two things:

  1. Split the weights vector into parts according to your layers: take (layerSize + 1) * nextLayerSize weights for each non-final layer (the +1 accounts for the bias).
  2. Reshape each flattened chunk with numpy's reshape, using the shape (layerSize + 1, nextLayerSize).
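The two steps above can be sketched as follows. This is a minimal illustration, assuming a layer configuration like the one you would pass to MultilayerPerceptronClassifier (the `layers` list and the stand-in flat array are made up for the example; with a real model you would use np.array(model.weights) instead):

```python
import numpy as np

# Hypothetical layer sizes: 4 inputs, one hidden layer of 5, 3 outputs.
layers = [4, 5, 3]

# Stand-in for the flattened weights vector the model returns.
# With a trained pyspark model this would be np.array(model.weights).
n_weights = sum((layers[i] + 1) * layers[i + 1] for i in range(len(layers) - 1))
flat = np.arange(n_weights, dtype=float)

# Walk through the flat vector, cutting out one (layerSize + 1, nextLayerSize)
# matrix per non-final layer.
matrices = []
offset = 0
for i in range(len(layers) - 1):
    rows, cols = layers[i] + 1, layers[i + 1]
    chunk = flat[offset:offset + rows * cols]
    matrices.append(chunk.reshape(rows, cols))
    offset += rows * cols

# matrices[0] has shape (5, 5): 4 input weights + 1 bias row, 5 hidden nodes.
# matrices[1] has shape (6, 3): 5 hidden weights + 1 bias row, 3 output nodes.
```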

When you derive the relative importance from your weights, notice that in the pyspark implementation the bias is represented as the last feature (see the docs).

Therefore the last row in each weight matrix represents the bias value.

Upvotes: 2
