Reputation: 3208
I am working through a Python book, but using Julia instead in order to learn the language, and I have come upon another area where I am not quite clear. Things seemed fine on simpler inputs, but when I start tossing more complex matrices at it, it falls apart.
include("activation_function_exercise/spiral_data.jl")
include("activation_function_exercise/dense_layer.jl")
include("activation_function_exercise/activation_relu.jl")
include("activation_function_exercise/activation_softmax.jl")
coords, color = spiral_data(100, 3)
dense1 = LayerDense(2,3)
dense2 = LayerDense(3,3)
forward(dense1, coords)
println("Forward 1 layer")
activated_output = relu_activation(dense1.output)
forward(dense2, activated_output)
println("Forward 2 layer")
activated_output2 = softmax_activation(dense2.output)
println("\n", activated_output2)
I get a proper matrix back
julia> activated_output2
300×3 Matrix{Float64}:
0.00333346 0.00333337 0.00333335
0.00333345 0.00333337 0.00333335
0.00333345 0.00333336 0.00333335
0.00333344 0.00333336 0.00333335
0.00333343 0.00333336 0.00333334
0.00333311 0.00333321 0.00333322
but the book has
>>>
[[0.33333 0.3333 0.3333]
...
Seems my values are two orders of magnitude lower than the book's (0.0033 vs 0.33), even when using FluxML's softmax function.
EDIT:
I thought maybe my ReLU activation code was causing the discrepancy and tried switching to the FluxML NNlib version, but I get the same activated_output2 with 0.0033333 instead of 0.333333. I will keep checking other parts, like my forward function.
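For reference, the two activation wrappers I am calling above live in the included files; they are roughly along these lines (an illustrative sketch, not the exact file contents):
using NNlib  # provides relu and softmax

# activation_relu.jl (sketch): elementwise ReLU
relu_activation(inputs) = max.(0.0, inputs)

# activation_softmax.jl (sketch): at this point I was just wrapping
# NNlib's softmax with its default arguments
softmax_activation(inputs) = softmax(inputs)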
EDIT2:
Adding my LayerDense implementation for completeness:
# see https://github.com/FluxML/Flux.jl/blob/b78a27b01c9629099adb059a98657b995760b617/src/layers/basic.jl#L71-L111
mutable struct LayerDense
    weights::Matrix{Float64}     # (num_inputs × num_neurons), small random init
    biases::Matrix{Float64}      # (1 × num_neurons), initialized to zero
    num_inputs::Integer
    num_neurons::Integer
    output::Matrix{Float64}      # left undefined here; filled in by forward()
    LayerDense(num_inputs::Integer, num_neurons::Integer) =
        new(0.01 * randn(num_inputs, num_neurons), zeros(1, num_neurons), num_inputs, num_neurons)
end

function forward(layer::LayerDense, inputs::Matrix{Float64})
    # (n_samples × num_inputs) * (num_inputs × num_neurons) .+ (1 × num_neurons) broadcast
    layer.output = inputs * layer.weights .+ layer.biases
end
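A quick hypothetical shape check of the forward pass (weights are random, so the values themselves will differ):
layer = LayerDense(2, 3)
x = randn(5, 2)          # 5 samples with 2 features each
forward(layer, x)
size(layer.output)       # (5, 3): one row of 3 neuron outputs per sample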
EDIT3:
Comparing against the Python nnfs library, I started inspecting my spiral_data implementation; it seems within reason.
Python
import numpy as np
import nnfs
from nnfs.datasets import spiral_data
nnfs.init()
X, y = spiral_data(samples=100, classes=3)
print(X[:4])  # just check the first couple
>>>
[[0. 0. ]
[0.00299556 0.00964661]
[0.01288097 0.01556285]
[0.02997479 0.0044481 ]]
JuliaLang
include("activation_function_exercise/spiral_data.jl")
coords, color = spiral_data(100, 3)
julia> coords
300×2 Matrix{Float64}:
0.0 0.0
-0.00133462 0.0100125
0.00346739 0.0199022
-0.00126302 0.0302767
0.00184948 0.0403617
0.0113095 0.0492225
0.0397276 0.0457691
0.0144484 0.0692151
0.0181726 0.0787382
0.0320308 0.0850793
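For context, my spiral_data.jl is essentially a port of the nnfs generator. A minimal sketch of such a port (the 0..1 radius, the 0.2 noise scale and the 2.5 angle multiplier follow the nnfs source as I remember it; the rest is illustrative):
function spiral_data(samples::Integer, classes::Integer)
    X = zeros(samples * classes, 2)
    y = zeros(Int, samples * classes)
    for class in 0:(classes - 1)
        ix = (class * samples + 1):((class + 1) * samples)
        r = range(0.0, 1.0, length=samples)                                                  # radius grows outward
        t = range(class * 4.0, (class + 1) * 4.0, length=samples) .+ 0.2 .* randn(samples)   # angle plus noise
        X[ix, 1] = r .* sin.(t .* 2.5)
        X[ix, 2] = r .* cos.(t .* 2.5)
        y[ix] .= class                                                                       # class label per sample
    end
    return X, y
end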
Upvotes: 1
Views: 133
Reputation: 3208
Turned out I was calling the NNlib softmax with its default dims=1, which normalizes down each column of 300 values (hence every entry came out ≈ 1/300 ≈ 0.00333). The Python book is NOT doing that; it normalizes each sample's three scores along the row. All I needed to do was modify my softmax() call like so:
using NNlib

function softmax_activation(inputs)
    # dims=2: softmax across each row, one probability distribution per sample
    return softmax(inputs, dims=2)
end
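To see why the default dims gave values near 1/300, here is a tiny made-up example:
using NNlib

scores = zeros(4, 3)        # 4 "samples", 3 "classes", all scores identical

softmax(scores)             # default dims=1: each COLUMN sums to 1, so every entry is 1/4
softmax(scores, dims=2)     # dims=2: each ROW sums to 1, so every entry is 1/3 ≈ 0.3333
With 300 rows, the default does the same thing down each column of 300 values, which is exactly where the 0.00333 ≈ 1/300 came from.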
Then the output at the end of my big long example comes out as expected
#using Pkg
#Pkg.add("Plots")
include("activation_function_exercise/spiral_data.jl")
include("activation_function_exercise/dense_layer.jl")
include("activation_function_exercise/activation_relu.jl")
include("activation_function_exercise/activation_softmax.jl")
coords, color = spiral_data(100, 3)
dense1 = LayerDense(2,3)
dense2 = LayerDense(3,3)
# Julia doesn't lend itself to OO programming,
# so the activations will just be plain functions
# activation1 = activation_relu
# activation2 = activation_softmax
forward(dense1, coords)
activated_output = relu_activation(dense1.output)
forward(dense2, activated_output)
activated_output2 = softmax_activation(dense2.output)
using Plots
#scatter(coords[:,1], coords[:,2])
scatter(coords[:,1], coords[:,2], zcolor=color, framestyle=:box)
display(activated_output2)
300×3 Matrix{Float64}:
0.333333 0.333333 0.333333
0.333336 0.333334 0.33333
0.333338 0.333339 0.333323
0.33334 0.333344 0.333316
0.333339 0.333361 0.3333
0.333341 0.333365 0.333294
0.333345 0.333362 0.333293
0.333345 0.333374 0.333281
0.333349 0.33337 0.333281
0.333347 0.33339 0.333262
⋮
0.333564 0.332673 0.333764
0.333583 0.332885 0.333532
0.333588 0.332967 0.333445
0.333587 0.333148 0.333265
0.333593 0.332935 0.333472
0.333596 0.333006 0.333398
0.333583 0.33333 0.333086
0.3336 0.333062 0.333338
0.333603 0.333082 0.333316
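And a quick sanity check that every row is now a proper probability distribution:
all(isapprox.(sum(activated_output2, dims=2), 1.0))   # true: each row sums to 1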
Upvotes: 1