Reputation: 77
I'm new to the Julia programming language, and still learning it by writing code that I've already written in Python (or, at least, tried out in Python).
There is an article which explains how to make a very simple neural network: https://medium.com/technology-invention-and-more/how-to-build-a-simple-neural-network-in-9-lines-of-python-code-cc8f23647ca1.
I tried the code from this article out in Python, and it's working fine. However, I haven't used linear algebra functions in Python before (like dot). Now I'm trying to translate this code to Julia, but there are some things I can't understand. Here is my Julia code:
using LinearAlgebra
synaptic_weights = [-0.16595599, 0.44064899, -0.99977125]::Vector{Float64}
sigmoid(x) = 1 / (1 + exp(-x))
sigmoid_derivative(x) = x * (1 - x)
function train(training_set_inputs, training_set_outputs, number_of_training_iterations)
    global synaptic_weights
    for iteration in 1:number_of_training_iterations
        output = think(training_set_inputs)
        error = training_set_outputs .- output
        adjustment = dot(transpose(training_set_inputs), error * sigmoid_derivative(output))
        synaptic_weights = synaptic_weights .+ adjustment
    end
end
think(inputs) = sigmoid(dot(inputs, synaptic_weights))
println("Random starting synaptic weights:")
println(synaptic_weights)
training_set_inputs = [0 0 1 ; 1 1 1 ; 1 0 1 ; 0 1 1]::Matrix{Int64}
training_set_outputs = [0, 1, 1, 0]::Vector{Int64}
train(training_set_inputs, training_set_outputs, 10000)
println("New synaptic weights after training:")
println(synaptic_weights)
println("Considering new situation [1, 0, 0] -> ?:")
println(think([1 0 0]))
I've already tried to initialize vectors (like synaptic_weights) as:
synaptic_weights = [-0.16595599 ; 0.44064899 ; -0.99977125]
However, the code is not working. More precisely, there are three things that are not clear to me:
When I try to run the Julia code above, I get the following error:
ERROR: LoadError: DimensionMismatch("first array has length 12 which does not match the length of the second, 3.")
Stacktrace:
[1] dot(::Array{Int64,2}, ::Array{Float64,1}) at C:\Users\julia\AppData\Local\Julia-1.0.3\share\julia\stdlib\v1.0\LinearAlgebra\src\generic.jl:702
[2] think(::Array{Int64,2}) at C:\Users\Viktória\Documents\julia.jl:21
[3] train(::Array{Int64,2}, ::Array{Int64,1}, ::Int64) at C:\Users\Viktória\Documents\julia.jl:11
[4] top-level scope at none:0
in expression starting at C:\Users\Viktória\Documents\julia.jl:28
This error comes when the function think(inputs) tries to compute the dot product of inputs and synaptic_weights. In this case, inputs is a 4x3 matrix and synaptic_weights is a 3x1 matrix (a vector). I know that they can be multiplied, and the result will be a 4x1 matrix (a vector). Doesn't that mean that their dot product can be computed?
Anyway, that product can be computed in Python using the numpy package, so I guess there must be a way to compute it in Julia as well.
For the dot product, I also tried to write a function that takes a and b as arguments: first it computes the product of a and b, then it returns the sum of the result. I'm not sure whether that is a good solution, but the code didn't produce the expected result when I used that function, so I removed it.
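In case it helps, the removed function looked roughly like this (I am reconstructing it from memory, so treat the exact form as a guess):
# my attempt at a dot product: multiply a and b, then sum the result
my_dot(a, b) = sum(a * b)
Looking at it now, a * b for the 4x3 inputs and the 3-element weights gives a 4-element vector, and sum collapses that into a single number, which probably explains the wrong result.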
Can you help me with this code, please?
Upvotes: 3
Views: 207
Reputation: 69939
Here is the code adjusted to Julia:
sigmoid(x) = 1 / (1 + exp(-x))
sigmoid_derivative(x) = x * (1 - x)

# broadcast sigmoid over the result of the matrix-vector product
think(synaptic_weights, inputs) = sigmoid.(inputs * synaptic_weights)

function train!(synaptic_weights, training_set_inputs, training_set_outputs,
                number_of_training_iterations)
    for iteration in 1:number_of_training_iterations
        output = think(synaptic_weights, training_set_inputs)
        error = training_set_outputs .- output
        # gradient step: inputs' * (error .* derivative of the activation)
        adjustment = transpose(training_set_inputs) * (error .* sigmoid_derivative.(output))
        synaptic_weights .+= adjustment   # update the weights in place
    end
end
synaptic_weights = [-0.16595599, 0.44064899, -0.99977125]
println("Random starting synaptic weights:")
println(synaptic_weights)
training_set_inputs = Float64[0 0 1 ; 1 1 1 ; 1 0 1 ; 0 1 1]
training_set_outputs = Float64[0, 1, 1, 0]
train!(synaptic_weights, training_set_inputs, training_set_outputs, 10000)
println("New synaptic weights after training:")
println(synaptic_weights)
println("Considering new situation [1, 0, 0] -> ?:")
println(think(synaptic_weights, Float64[1 0 0]))
There are multiple changes, so if some of them are not clear to you, please ask and I will expand on them.
The most important things I have changed:
- all containers are defined to have a Float64 element type
- scalar functions are broadcast over arrays (e.g. the sigmoid and sigmoid_derivative functions are defined in such a way that they expect to get a number as an argument, therefore when we call them a . is added after their name to trigger broadcasting); see the first example below
- matrix multiplication with * is used instead of dot; see the second example below
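To illustrate the broadcasting point, here is a minimal, self-contained sketch (the vector v is just made-up sample data):
sigmoid(x) = 1 / (1 + exp(-x))   # defined for a single number
v = [0.0, 1.0, 2.0]
sigmoid.(v)   # the dot broadcasts sigmoid over each element -> 3-element Vector{Float64}
# sigmoid(v)  # would throw a MethodError, because exp is not defined for a Vector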
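And to illustrate * versus dot: dot in Julia's LinearAlgebra requires both arguments to have the same number of elements (which is exactly the DimensionMismatch from the question, 12 vs. 3), while * performs matrix-vector multiplication:
using LinearAlgebra
A = Float64[0 0 1 ; 1 1 1 ; 1 0 1 ; 0 1 1]  # 4x3 matrix, i.e. 12 elements
w = [-0.16595599, 0.44064899, -0.99977125]  # 3 elements
A * w       # matrix-vector product -> 4-element Vector{Float64}
# dot(A, w) # throws DimensionMismatch: dot needs arguments of equal length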
The code runs around 30x faster than the original implementation in Python. I have not squeezed out maximum performance (it currently does a lot of allocations, which could be avoided), as that would require rewriting its logic a bit, and I guess you wanted a direct reimplementation. A sketch of an allocation-reducing variant is shown below.
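For reference, here is a sketch of what reducing the allocations could look like, using preallocated buffers and mul! from LinearAlgebra (the name train_prealloc! and the buffer layout are my own choices, not part of the code above):
using LinearAlgebra

function train_prealloc!(synaptic_weights, training_set_inputs, training_set_outputs,
                         number_of_training_iterations)
    output = similar(training_set_outputs)   # buffer for the predictions
    delta = similar(training_set_outputs)    # buffer for error .* derivative
    adjustment = similar(synaptic_weights)   # buffer for the weight update
    for iteration in 1:number_of_training_iterations
        mul!(output, training_set_inputs, synaptic_weights)      # inputs * weights, in place
        output .= sigmoid.(output)
        delta .= (training_set_outputs .- output) .* sigmoid_derivative.(output)
        mul!(adjustment, transpose(training_set_inputs), delta)  # inputs' * delta, in place
        synaptic_weights .+= adjustment
    end
    return synaptic_weights
end
This allocates the three buffers once instead of creating new arrays on every iteration; the numerical results should match train! above.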
Upvotes: 10