Alejo

Reputation: 13

Unexpected behavior in Julia boolean comparison

I'm testing different parametrizations of the CDF of the logistic distribution, comparing the results and the effect of the different parameters on the curve.

using Distributions

# Vector of x to test the different functions
x = collect(0:20)

Logis = Logistic(10, 1)  # Logistic distribution with location μ = 10 and scale θ = 1
y = cdf.(Logis, x)       # CDF of that distribution, evaluated at each x

# This is a standard representation of the CDF for Logistic
LogisticV1(x, μ=10, θ=1) = 1 / (1 + exp(-(x-μ)/θ))
y1 = LogisticV1.(x)

# This is another representation of the CDF for Logistic
LogisticV2(x, μ=10, θ=1) = 1/2 + 1/2 * tanh((x-μ)/(2θ))
y2 = LogisticV2.(x)

The plots of all three functions are identical, as expected. All three y vectors have the same type (Array{Float64,1}), and they also appear to be identical when printed.

show(y)

[4.53979e-5, 0.000123395, 0.00033535, 0.000911051, 0.00247262, 0.00669285, 0.0179862, 0.0474259, 0.119203, 0.268941, 0.5, 0.731059, 0.880797, 0.952574, 0.982014, 0.993307, 0.997527, 0.999089, 0.999665, 0.999877, 0.999955]

show(y1)

[4.53979e-5, 0.000123395, 0.00033535, 0.000911051, 0.00247262, 0.00669285, 0.0179862, 0.0474259, 0.119203, 0.268941, 0.5, 0.731059, 0.880797, 0.952574, 0.982014, 0.993307, 0.997527, 0.999089, 0.999665, 0.999877, 0.999955]

show(y2)

[4.53979e-5, 0.000123395, 0.00033535, 0.000911051, 0.00247262, 0.00669285, 0.0179862, 0.0474259, 0.119203, 0.268941, 0.5, 0.731059, 0.880797, 0.952574, 0.982014, 0.993307, 0.997527, 0.999089, 0.999665, 0.999877, 0.999955]

However:

y == y1    # true
y == y2    # false
y1 == y2   # false

Why is this happening? I assume this has something to do with floating point variations introduced by the tanh function in LogisticV2, but I'm not sure. I appreciate any insight into this.

EDIT: Fixed some typos to make code runnable

Upvotes: 1

Views: 191

Answers (2)

carstenbauer

Reputation: 10147

To compare floating point numbers use isapprox rather than ==.

In your case, you will see that isapprox(y,y1) == isapprox(y,y2) == isapprox(y1,y2) == true. Furthermore, you can check maximum(abs.(y-y2)) to see that the difference is on the order of floating-point precision (I find 1.1102230246251565e-16). Note, however, that isapprox checks the relative deviation by default.
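To illustrate the note about relative deviation, here is a minimal sketch (assuming Julia 1.x and no extra packages; the variable names are only for illustration). It shows that the default relative test behaves well near 1 but fails near zero unless an absolute tolerance atol is passed:

```julia
# By default isapprox is a relative test: rtol = sqrt(eps(Float64)) ≈ 1.5e-8, atol = 0.
a = 1.0
b = 1.0 + 1e-12
near_one = isapprox(a, b)               # true: the relative error (1e-12) is far below rtol

# Near zero the allowed error shrinks with the magnitude of the inputs,
# so a tiny value is not "approximately zero" under the defaults.
tiny = 1e-12
near_zero_default = isapprox(tiny, 0.0)              # false with the defaults
near_zero_atol    = isapprox(tiny, 0.0; atol=1e-9)   # true once an absolute tolerance is given

println((near_one, near_zero_default, near_zero_atol))   # prints (true, false, true)
```

So if the values being compared can be very close to zero, passing an explicit atol is probably the safer choice.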

Upvotes: 1

HarmonicaMuse

Reputation: 7893

"I assume this has something to do with floating point variations introduced by the tanh function in LogisticV2"

You are correct:

julia> (y .== y1)'
1×21 RowVector{Bool,BitArray{1}}:
 true  true  true  true  true  true  true  true  true  true  true  true  true  true  true  true  true  true  true  true  true

julia> (y .== y2)'
1×21 RowVector{Bool,BitArray{1}}:
 false  false  false  false  false  false  false  false  false  true  true  true  false  false  true  false  false  true  false  false  false
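To get a feel for how small these mismatches are, here is a rough sketch that avoids the Distributions dependency. It assumes Julia 1.x, and logistic_exp and logistic_tanh are hypothetical stand-ins for LogisticV1 and LogisticV2 from the question. It expresses the largest gap between the two formulas in multiples of eps(Float64):

```julia
# Hypothetical stand-ins for the two formulas in the question (μ = 10, θ = 1).
logistic_exp(x, μ=10, θ=1)  = 1 / (1 + exp(-(x - μ) / θ))
logistic_tanh(x, μ=10, θ=1) = 1/2 + 1/2 * tanh((x - μ) / (2θ))

x  = collect(0:20)
ya = logistic_exp.(x)
yb = logistic_tanh.(x)

# Largest absolute gap between the two results.
maxdiff = maximum(abs.(ya .- yb))

# The same gap in multiples of eps(Float64) (about 2.2e-16).
# The exact figure depends on the platform's exp and tanh implementations.
gap_in_eps = maxdiff / eps(Float64)

println("max |ya - yb| = ", maxdiff, " (", gap_in_eps, " eps)")
```

The two expressions are mathematically equal, but each one rounds at different intermediate steps, so the last bit or so of the result can differ.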

But:

julia> y ≈ y2    # \approx<TAB> for: ≈ symbol
true

Here ≈ is a Unicode alias for isapprox:

help?> ≈

"≈" can be typed by \approx

search: ≈

isapprox(x, y; rtol::Real=sqrt(eps), atol::Real=0, nans::Bool=false, norm::Function)

Inexact equality comparison: true if norm(x-y) <= atol + rtol*max(norm(x), norm(y)). The default atol is zero and the default rtol depends on the types of x and y. The keyword argument nans determines whether or not NaN values are considered equal (defaults to false).

For real or complex floating-point values, rtol defaults to sqrt(eps(typeof(real(x-y)))). This corresponds to requiring equality of about half of the significand digits. For other types, rtol defaults to zero.

x and y may also be arrays of numbers, in which case norm defaults to vecnorm but may be changed by passing a norm::Function keyword argument. (For numbers, norm is the same thing as abs.) When x and y are arrays, if norm(x-y) is not finite (i.e. ±Inf or NaN), the comparison falls back to checking whether all elements of x and y are approximately equal component-wise.

The binary operator ≈ is equivalent to isapprox with the default arguments, and x ≉ y is equivalent to !isapprox(x,y).

julia> 0.1 ≈ (0.1 - 1e-10)   
true

julia> isapprox(10, 11; atol = 2)
true

julia> isapprox([10.0^9, 1.0], [10.0^9, 2.0])   
true
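The last example in the help text can be surprising, since 1.0 and 2.0 are clearly not close. A small sketch (assuming Julia 1.x; the variable names are only for illustration) contrasts the norm-based array comparison with an element-wise one:

```julia
using LinearAlgebra  # standard library; loaded explicitly so the array method of isapprox is available

a = [10.0^9, 1.0]
b = [10.0^9, 2.0]

# Array form: compares norm(a - b) = 1.0 against rtol * max(norm(a), norm(b)),
# roughly 1.5e-8 * 1e9 ≈ 15, so the arrays count as approximately equal.
whole_array = isapprox(a, b)           # true

# Element-wise form: broadcast isapprox over the pairs and require all of them to pass.
# 1.0 and 2.0 are not approximately equal on their own, so this fails.
per_element = all(isapprox.(a, b))     # false

println((whole_array, per_element))    # prints (true, false)
```

If every component has to match on its own, the element-wise form is probably what you want; the array form lets large entries dominate the tolerance.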

Upvotes: 0
