NoviceStat

Reputation: 135

How to change a value to missing

I seem to be unable to change a value to missing in Julia version 0.6.4 (I believe it was allowed before 0.6).

Example code:

using DataFrames
x = zeros(5)
5-element Array{Float64,1}:
 0.0
 0.0
 0.0
 0.0
 0.0
x[3] = missing
ERROR: MethodError: Cannot `convert` an object of type Missings.Missing to an object of type Float64
This may have arisen from a call to the constructor Float64(...),
since type constructors fall back to convert methods.
Stacktrace:
[1] setindex!(::Array{Float64,1}, ::Missings.Missing, ::Int64) at ./array.jl:583

In this setting I am trying to encode certain indices as missing values for an analysis. Is there a simple workaround?

Upvotes: 5

Views: 2895

Answers (1)

Colin T Bowers

Reputation: 18530

In Julia, missing has its own type:

julia> typeof(missing)
Missings.Missing

In your case, it is particularly important to note that:

julia> Missing <: Float64
false

That is, Missing is not a subtype of Float64. Now, note that:

julia> typeof(zeros(5))
Array{Float64,1}

So you construct x, an array that should only contain Float64. Since missing is not a subtype of Float64, when you try to change one of the elements of x to missing, you get an error, in the same way you would get an error if you tried x[3] = "a string".
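To see the constraint directly, here is a minimal sketch that inspects the element type and catches the failed assignment. It assumes Julia 1.0 or later, where Missing lives in Base; on v0.6 you would first need using Missings.

```julia
x = zeros(5)

# The element type is fixed when the array is created.
@show eltype(x)                 # Float64
@show missing isa eltype(x)     # false

# Assigning a value that cannot be converted to Float64 throws a MethodError.
err = try
    x[3] = missing
    nothing
catch e
    e
end
@show err isa MethodError       # true
```

The array itself is left unchanged by the failed assignment, so x still holds five zeros afterwards.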

If you want an array to contain both the type Missing and the type Float64, then you need to specify up front that the elements of the array can be of type Missing or type Float64. In Julia v0.6 (which you specify in the question), you can do this via missings, which is located in the Missings.jl package, e.g.:

julia> x = missings(Float64, 2)
2-element Array{Union{Float64, Missings.Missing},1}:
 missing
 missing

julia> x[1] = 0.0
0.0

julia> x
2-element Array{Union{Float64, Missings.Missing},1}:
 0.0     
  missing

In v1.0, the core functionality related to missing was moved into Base, so instead you would need:

julia> Array{Union{Float64,Missing}}(missing, 2)
2-element Array{Union{Missing, Float64},1}:
 missing
 missing

which is admittedly a little cumbersome. The missings syntax from v0.6 is still available in v1.0 through Missings.jl, but many people no longer bother with the package: since the type Missing itself now lives in Base, Missings.jl is optional in v1.0, whereas in v0.6 it was required.

If you already have a pre-existing Array{Float64} and want to mark some of the elements as missing, then (as far as I know) you will need to re-construct the array. For example, in both v0.6 and v1.0 you could use:

julia> x = randn(2)
2-element Array{Float64,1}:
 -0.642867
 -1.17995 

julia> y = convert(Vector{Union{Missing,Float64}}, x)
2-element Array{Union{Float64, Missings.Missing},1}:
 -0.642867
 -1.17995 

julia> y[2] = missing
missing
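Since the question is about marking certain indices as missing for an analysis, the same conversion can be extended to several positions at once. This is only a sketch for Julia 1.0 or later; the data and the index vector bad are made up for illustration.

```julia
x = [1.5, 2.0, 3.5, 4.0]            # illustrative data, not from the question

# Copy into a vector whose element type also admits `missing`.
y = convert(Vector{Union{Missing,Float64}}, x)

# Mark several positions as missing in one step via broadcast assignment.
bad = [2, 4]                        # hypothetical indices to blank out
y[bad] .= missing

# `ismissing` tests individual elements; `skipmissing` iterates over the rest.
n_missing = count(ismissing, y)     # 2
total = sum(skipmissing(y))         # 1.5 + 3.5 = 5.0
```

Because the element types differ, convert allocates a new vector here, so the original x keeps its Float64 values.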

Note that missing is mostly intended for use in data structures like DataFrames, where a lot of this happens automatically, so you don't have to spend time typing out so many Unions. This might be one reason why the syntax is a little verbose when working with regular arrays, as you are.

One final point: you could of course explicitly construct your arrays to accept any type, e.g. x = Any[1.0, 2.0]; x[1] = missing. The downside is that the compiler can no longer generate type-specialized code for operations on x, so you lose much of the speed benefit of working in Julia.
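For comparison, a short sketch (Julia 1.0 or later assumed) of the two element types side by side. Both arrays accept missing, but only the narrow Union tells the compiler what the elements can be; how much that matters in practice depends on the Julia version and the workload.

```julia
a = Any[1.0, 2.0]                      # accepts anything
u = Union{Missing,Float64}[1.0, 2.0]   # accepts only Float64 or missing

a[1] = missing
u[1] = missing

@show eltype(a)   # Any
@show eltype(u)   # Union{Missing, Float64}

# The Any array also accepts unrelated values, which the Union array rejects.
a[2] = "a string"
```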

Upvotes: 5
