Reputation: 245
I'm new to Julia, so I'm sorry if this is a basic question.
Say we have a dictionary, and a vector of keys:
X = [2, 1, 1, 3]
d = Dict( 1 => "A", 2 => "B", 3 => "C")
I want to create a new array which contains values instead of keys (according to the dictionary), so the end result would be something like
Y = ["B", "A", "A", "C"]
I suppose I could iterate over the vector, look each element up in the dictionary, and store the corresponding value, but this seems awfully inefficient to me. Something like
Y = Array{String}(undef, length(X))
for i in eachindex(X)
    Y[i] = d[X[i]]
end
EDIT: Also, my proposed solution doesn't work if X contains missing values.
So my question is whether there is a more efficient way of doing this (I'm doing it with a much larger array and dictionary), or whether this is an appropriate way of doing it.
Upvotes: 4
Views: 640
Reputation: 33259
You can use an array comprehension to do this pretty tersely:
julia> [d[x] for x in X]
4-element Array{String,1}:
"B"
"A"
"A"
"C"
In the future it may be possible to write d.[X] to express this even more concisely, but as of Julia 1.3, that is not yet allowed.
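Until something like that exists, broadcasting getindex over the keys gives a similarly compact form. The following is only a sketch using the X and d from the question; the names Y_broadcast and Y_map are made up for illustration:

```julia
X = [2, 1, 1, 3]
d = Dict(1 => "A", 2 => "B", 3 => "C")

# Ref(d) makes broadcasting treat the dictionary as a single scalar
# argument, so only X is broadcast over.
Y_broadcast = getindex.(Ref(d), X)

# map with an anonymous function performs the same lookups.
Y_map = map(x -> d[x], X)
```

Both forms do the same per-element lookup as the comprehension, so which one to use is mostly a matter of taste.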
As per the edit to the question, let's suppose there is a missing value somewhere in X:
julia> X = [2, 1, missing, 1, 3]
5-element Array{Union{Missing, Int64},1}:
2
1
missing
1
3
If you want to map missing to missing or some other value like the string "?", you can do that explicitly like this:
julia> [ismissing(x) ? missing : d[x] for x in X]
5-element Array{Union{Missing, String},1}:
"B"
"A"
missing
"A"
"C"
julia> [ismissing(x) ? "?" : d[x] for x in X]
5-element Array{String,1}:
"B"
"A"
"?"
"A"
"C"
If you're going to do that a lot, it might be easier to put missing in the dictionary like this:
julia> d = Dict(missing => "?", 1 => "A", 2 => "B", 3 => "C")
Dict{Union{Missing, Int64},String} with 4 entries:
2 => "B"
missing => "?"
3 => "C"
1 => "A"
julia> [d[x] for x in X]
5-element Array{String,1}:
"B"
"A"
"?"
"A"
"C"
If you want to simply skip over missing values, you can use skipmissing(X) instead of X:
julia> [d[x] for x in skipmissing(X)]
4-element Array{String,1}:
"B"
"A"
"A"
"C"
There's generally not a single correct way to handle missing values, which is why you need to explicitly code how to handle missing data.
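Another option, if you would rather not add missing to the dictionary, is get with a default value. This is only a sketch, assuming the original three-entry dictionary; Y_default and Y_missing are illustrative names:

```julia
X = [2, 1, missing, 1, 3]
d = Dict(1 => "A", 2 => "B", 3 => "C")

# get(d, key, default) returns default when key is not in d.
# missing is not a key here, so it falls through to the default.
Y_default = [get(d, x, "?") for x in X]

# Using missing as the default propagates it into the result instead.
Y_missing = [get(d, x, missing) for x in X]
```

Note that this also maps any other key that is absent from the dictionary (say, 4) to the default instead of throwing a KeyError, which may or may not be what you want.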
Upvotes: 5
Reputation: 13800
Efficiency can mean different things in different contexts, but I would probably do:
Y = [d[i] for i in X]
If X contains missing values, you could use skipmissing(X) in the comprehension.
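Keep in mind that skipmissing drops elements, so the result no longer lines up with X. If positions matter, a plain loop inside a function is also a reasonable choice in Julia. A rough sketch, where lookup_values is a hypothetical helper name and not part of Julia:

```julia
# lookup_values is a hypothetical helper, not a Base function.
function lookup_values(d::AbstractDict, X::AbstractVector)
    # Allow missing in the output so missing inputs can pass through.
    Y = Vector{Union{Missing, valtype(d)}}(undef, length(X))
    for (i, x) in enumerate(X)
        Y[i] = ismissing(x) ? missing : d[x]
    end
    return Y
end

d = Dict(1 => "A", 2 => "B", 3 => "C")
X = [2, 1, missing, 1, 3]

Y = lookup_values(d, X)                       # same length as X
Y_skipped = [d[x] for x in skipmissing(X)]    # one element shorter
```

Here Y keeps a missing in the third position, while Y_skipped has only four elements.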
Upvotes: 6