Reputation: 3278
I am trying to do a very simple logistic regression in Julia, but Julia's type system seems to be causing me problems. Basically, GLM's predict gives me an array of probabilities. I want to threshold them so that a probability >= 0.5 becomes 1 and anything else becomes 0. I would also like those labels to be integers.
No matter what I do, I can't convert the DataArray returned by predict to Int64. If I create an ad hoc DataArray, I can round it just fine, even though both show a type of DataArrays.DataArray{Float64,1}. I've also tried things like pred>0.5, but that fails similarly. Clearly there is something about the return value from predict, beyond its type, that makes it different from the other DataArray in my short program.
using DataFrames;
using GLM;
df = readtable("./data/titanic-dataset.csv");
delete!(df, :PassengerId);
delete!(df, :Name);
delete!(df, :Ticket);
delete!(df, :Cabin);
pool!(df, [:Sex]);
pool!(df, [:Embarked]);
df[isna.(df[:Age]),:Age] = median(df[ .~isna.(df[:Age]),:Age])
model = glm(@formula(Survived ~ Pclass + Sex + Age + SibSp + Parch + Fare + Embarked), df, Binomial(), LogitLink());
pred = predict(model,df);
z = DataArray([1.0,2.0,3.0]);
println(typeof(z));
println(typeof(pred));
println(round.(Int64,z)); # Why does this work?
println(round.(Int64,pred)); # But this does not?
The output is:
DataArrays.DataArray{Float64,1}
DataArrays.DataArray{Float64,1}
[1, 2, 3]
MethodError: no method matching round(::Type{Int64}, ::DataArrays.NAtype)
Closest candidates are:
round(::Type{T<:Integer}, ::Integer) where T<:Integer at int.jl:408
round(::Type{T<:Integer}, ::Float16) where T<:Integer at float.jl:338
round(::Type{T<:Union{Signed, Unsigned}}, ::BigFloat) where T<:Union{Signed, Unsigned} at mpfr.jl:214
...
Stacktrace:
[1] macro expansion at C:\Users\JHeaton\.julia\v0.6\DataArrays\src\broadcast.jl:32 [inlined]
[2] macro expansion at .\cartesian.jl:64 [inlined]
[3] macro expansion at C:\Users\JHeaton\.julia\v0.6\DataArrays\src\broadcast.jl:111 [inlined]
[4] _broadcast!(::DataArrays.##116#117{Int64,Base.#round}, ::DataArrays.DataArray{Int64,1}, ::DataArrays.DataArray{Float64,1}) at C:\Users\JHeaton\.julia\v0.6\DataArrays\src\broadcast.jl:67
[5] broadcast!(::Function, ::DataArrays.DataArray{Int64,1}, ::Type{Int64}, ::DataArrays.DataArray{Float64,1}) at C:\Users\JHeaton\.julia\v0.6\DataArrays\src\broadcast.jl:169
[6] broadcast(::Function, ::Type{T} where T, ::DataArrays.DataArray{Float64,1}) at .\broadcast.jl:434
[7] include_string(::String, ::String) at .\loading.jl:515
Upvotes: 0
Views: 514
Reputation: 8044
You can't create integers when you have NAs in the array, which is the case for pred (z has none, which is why it works). You can apply round. to them (in which case you'll get a DataArray of Floats), but when you try to make them Int it will complain because NA can't be Int64.
Instead do
convert(DataArray{Int}, round.(pred))
Also, it is nicer to post an example using data available in a package rather than a local dataset on your computer.
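For what it's worth, here is a minimal sketch of the same idea in newer Julia versions (1.x), where missing from Base has replaced the NA of DataArrays. This is an assumption about your setup, since the question uses v0.6. The vector pred below is a made-up stand-in for the output of predict, not real data. It thresholds explicitly instead of rounding, because round breaks ties to the nearest even number, so round(0.5) is 0.0 and not the 1 you asked for.

```julia
# Hypothetical stand-in for the output of predict(model, df):
# probabilities with one missing entry (the modern counterpart of NA).
pred = [0.2, missing, 0.7, 0.5]

# Threshold at 0.5, passing missing entries through untouched.
# The element type becomes Union{Missing, Int64}, so the labels stay integers.
labels = [ismissing(p) ? missing : Int(p >= 0.5) for p in pred]

# Alternative: drop the missing entries first, giving a plain Vector{Int64}.
labels_complete = [Int(p >= 0.5) for p in skipmissing(pred)]
```

Which of the two you want depends on whether the labels need to stay aligned with the rows of the original table; the second version changes the length.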
Upvotes: 2