Reputation: 1056
I have been reading papers about LSTM and checking its implementations. There is one point that is not clear to me.
Most of the papers say that the weight matrices from the cell to the gate vectors should be diagonal (e.g., Alex, 2013, page 5), but I haven't seen this in any implementation.
For example, this:
1
2
Another example is from the MILA lab.
3
Are these implementations wrong, or am I missing something?
Upvotes: 7
Views: 2805
Reputation: 1557
The TensorFlow implementation does use a diagonal matrix; see here. In practice, this means each peephole connects a cell only to its own gates, so the matrix product reduces to an elementwise vector multiply.
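To make the equivalence concrete, here is a minimal pure-Python sketch (not taken from TensorFlow or any other library). It checks that multiplying the previous cell state by a diagonal matrix gives the same result as an elementwise product with a peephole weight vector. The names `w_peep`, `c_prev`, and `pre_act`, and all the numbers, are made up for illustration; `pre_act` stands in for the usual input and recurrent terms of a gate.

```python
import math


def sigmoid(z):
    """Logistic function used for LSTM gates."""
    return 1.0 / (1.0 + math.exp(-z))


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def diag(weights):
    """Build a square matrix with `weights` on the diagonal, zeros elsewhere."""
    n = len(weights)
    return [[weights[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


# Hypothetical values, chosen only for illustration.
c_prev = [0.5, -1.0, 2.0]    # previous cell state
w_peep = [0.1, 0.2, 0.3]     # peephole weights: one scalar per cell unit
pre_act = [0.0, 0.3, -0.4]   # stand-in for W_x x_t + W_h h_{t-1} + b

# "Diagonal weight matrix" form, as written in the papers.
full = matvec(diag(w_peep), c_prev)

# Elementwise form, which is what an implementation can store and compute.
elementwise = [w * c for w, c in zip(w_peep, c_prev)]

# Gate activation with the peephole term added to the other contributions.
gate = [sigmoid(a + p) for a, p in zip(pre_act, elementwise)]
```

Because of this, an implementation can keep the peephole weights as a vector rather than a matrix, which may be why no diagonal matrix is visible in the code you looked at.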
Upvotes: 5