han-tyumi

Reputation: 245

Create array of values based on dictionary and array of keys

I'm new to Julia, so I'm sorry if this is a basic question.

Say we have a dictionary, and a vector of keys:

X = [2, 1, 1, 3]
d = Dict( 1 => "A", 2 => "B", 3 => "C")

I want to create a new array which contains values instead of keys (according to the dictionary), so the end result would be something like

Y = ["B", "A", "A", "C"]

I suppose I could iterate over the vector, look each element up in the dictionary, and store the corresponding value, but this seems awfully inefficient to me. Something like

Y = Array{String}(undef, length(X))
for i in 1:length(X)
    Y[i] = d[X[i]]
end

EDIT: Also, my proposed solution doesn't work if X contains missing values.

So my question is: is there a more efficient way of doing this (I'm doing it with a much larger array and dictionary), or is this an appropriate way of doing it?

Upvotes: 4

Views: 640

Answers (2)

StefanKarpinski

Reputation: 33259

You can use an array comprehension to do this pretty tersely:

julia> [d[x] for x in X]
4-element Array{String,1}:
 "B"
 "A"
 "A"
 "C"

In the future it may be possible to write d.[X] to express this even more concisely, but as of Julia 1.3, that is not yet allowed.
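In the meantime, the same lookup can also be written with map or by broadcasting getindex over X. This is only a sketch using the X and d from the question; Ref(d) is there so that broadcasting treats the dictionary as a single value instead of trying to iterate over it.

```julia
X = [2, 1, 1, 3]
d = Dict(1 => "A", 2 => "B", 3 => "C")

# Array comprehension
Y1 = [d[x] for x in X]

# map with an anonymous function
Y2 = map(x -> d[x], X)

# Broadcast getindex over X; Ref(d) makes the dictionary act as a scalar
Y3 = getindex.(Ref(d), X)
```

All three produce the same vector, so the choice is mostly a matter of taste.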

As per the edit to the question, let's suppose there is a missing value somewhere in X:

julia> X = [2, 1, missing, 1, 3]
5-element Array{Union{Missing, Int64},1}:
 2
 1
  missing
 1
 3

If you want to map missing to missing, or to some other value like the string "?", you can do that explicitly like this:

julia> [ismissing(x) ? missing : d[x] for x in X]
5-element Array{Union{Missing, String},1}:
 "B"
 "A"
 missing
 "A"
 "C"

julia> [ismissing(x) ? "?" : d[x] for x in X]
5-element Array{String,1}:
 "B"
 "A"
 "?"
 "A"
 "C"

If you're going to do that a lot, it might be easier to put missing in the dictionary like this:

julia> d = Dict(missing => "?", 1 => "A", 2 => "B", 3 => "C")
Dict{Union{Missing, Int64},String} with 4 entries:
  2       => "B"
  missing => "?"
  3       => "C"
  1       => "A"

julia> [d[x] for x in X]
5-element Array{String,1}:
 "B"
 "A"
 "?"
 "A"
 "C"

If you want to simply skip over missing values, you can use skipmissing(X) instead of X:

julia> [d[x] for x in skipmissing(X)]
4-element Array{String,1}:
 "B"
 "A"
 "A"
 "C"

There's generally not a single correct way to handle missing values, which is why you need to explicitly code how to handle missing data.

Upvotes: 5

Nils Gudat

Reputation: 13800

Efficiency can mean different things in different contexts, but I would probably do:

Y = [d[i] for i in X]

If X contains missing values, you could use skipmissing(X) in the comprehension, which drops those elements from the result.

Upvotes: 6
