emonigma

Reputation: 4426

Julia: Cannot `convert` an object of type Array{Number,1} to an object of type GLM.LmResp

I am building a DataFrame row by row and then running a regression on it. For simplicity, the code is:

using DataFrames
using GLM

df = DataFrame(response = Number[])
for i in 1:10
    df = vcat(df, DataFrame(response = rand()))
end

fit(LinearModel, @formula(response ~ 1), df)

I get the error:

ERROR: LoadError: MethodError: Cannot `convert` an object of type Array{Number,1} to an object of type GLM.LmResp
This may have arisen from a call to the constructor GLM.LmResp(...),
since type constructors fall back to convert methods.
Stacktrace:
 [1] fit(::Type{GLM.LinearModel}, ::Array{Float64,2}, ::Array{Number,1}) at ~/.julia/v0.6/GLM/src/lm.jl:140
 [2] #fit#44(::Dict{Any,Any}, ::Array{Any,1}, ::Function, ::Type{GLM.LinearModel}, ::StatsModels.Formula, ::DataFrames.DataFrame) at ~/.julia/v0.6/StatsModels/src/statsmodel.jl:72
 [3] fit(::Type{GLM.LinearModel}, ::StatsModels.Formula, ::DataFrames.DataFrame) at ~/.julia/v0.6/StatsModels/src/statsmodel.jl:66
 [4] include_from_node1(::String) at ./loading.jl:576
 [5] include(::String) at ./sysimg.jl:14
while loading ~/test.jl, in expression starting on line 10

The call is very similar to the regression example in "Introducing Julia":

linearmodel = fit(LinearModel, @formula(Y1 ~ X1), anscombe)

What is the problem?

Upvotes: 3

Views: 2665

Answers (1)

emonigma

Reputation: 4426

After a few hours, I realized that GLM requires a response vector with a concrete element type, and Number is an abstract type. (At the time of this writing, the documentation for GLM.LmResp does not mention this; it says only "Encapsulates the response for a linear model".) The solution is to declare the column with a concrete element type, such as Float64:

using DataFrames
using GLM

df = DataFrame(response = Float64[])
for i in 1:10
    df = vcat(df, DataFrame(response = rand()))
end

fit(LinearModel, @formula(response ~ 1), df)

Output:

StatsModels.DataFrameRegressionModel{GLM.LinearModel{GLM.LmResp{Array{Float64,1}},GLM.DensePredChol{Float64,Base.LinAlg.Cholesky{Float64,Array{Float64,2}}}},Array{Float64,2}}

Formula: response ~ +1

Coefficients:
             Estimate Std.Error t value Pr(>|t|)
(Intercept)  0.408856 0.0969961 4.21518   0.0023
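
The difference between the two declarations can be checked directly on the element type of the column. The following is a minimal sketch in plain Julia (no DataFrames or GLM needed). It assumes Julia 1.x, where `isconcretetype` and `isabstracttype` are available; on 0.6 I believe the equivalent check was `isleaftype`. The helper `require_concrete` is hypothetical and not part of GLM or DataFrames; it only illustrates failing early with a clearer message.

```julia
# Assumes Julia 1.x; `require_concrete` is a hypothetical helper, not a GLM function.
abstract_col = Number[0.1, 0.2, 0.3]   # element type is the abstract type Number
concrete_col = Float64[0.1, 0.2, 0.3]  # element type is the concrete type Float64

println(isabstracttype(eltype(abstract_col)))  # checks Number
println(isconcretetype(eltype(concrete_col)))  # checks Float64

# Reject abstractly typed columns before they reach the model fit.
function require_concrete(col::AbstractVector)
    T = eltype(col)
    if !isconcretetype(T)
        throw(ArgumentError("column has non-concrete element type $T; convert it first, e.g. to Float64"))
    end
    return col
end
```

Calling `require_concrete(abstract_col)` throws an `ArgumentError`, while `require_concrete(concrete_col)` returns the vector unchanged.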

The element type has to be concrete. Another abstract type, such as Real with df = DataFrame(response = Real[]), also fails, although with a more helpful error message:

ERROR: LoadError: `float` not defined on abstractly-typed arrays; please convert to a more specific type

Alternatively, you can keep the Number column while building the DataFrame and convert it afterwards, for example by mapping Real over it:

using DataFrames
using GLM

df = DataFrame(response = Number[])
for i in 1:10
    df = vcat(df, DataFrame(response = rand()))
end

df2 = DataFrame(response = map(Real, df[:response]))

fit(LinearModel, @formula(response ~ 1), df2)

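This works because map infers the element type of its result from the values it produces. Real leaves each Float64 value unchanged, so the new array ends up with the concrete element type Float64: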

julia> typeof(df2[:response])
Array{Float64,1}
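
The narrowing can be reproduced on a plain vector, without DataFrames. This is a sketch assuming Julia 1.x; the variable names are made up for illustration. It also shows two more explicit ways to get a Float64 vector, and a caveat: mapping Real only yields a concrete element type when all values share one concrete type.

```julia
# Assumes Julia 1.x; variable names are illustrative only.
xs = Number[0.1, 0.2, 0.3]          # abstractly typed, like the column built row by row

ys = map(Real, xs)                  # element type inferred from the values
zs = convert(Vector{Float64}, xs)   # explicit conversion to a concrete element type
ws = Float64.(xs)                   # broadcasted constructor, same result

println(eltype(ys), " ", eltype(zs), " ", eltype(ws))

# Caveat: with mixed Int and Float64 values, map(Real, ...) stays abstract,
# whereas the explicit Float64 conversion still produces a concrete vector.
mixed = Number[1, 0.5]
println(isconcretetype(eltype(map(Real, mixed))))
println(isconcretetype(eltype(Float64.(mixed))))
```

For that reason, converting explicitly to Float64 is probably the safer choice when the column might hold a mix of integers and floats.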

I filed an issue with GLM to improve the error message.

Upvotes: 1
