Reputation: 53
I have a dictionary of the form
"san-diego.new-york" => 0.225
"seattle.topeka" => 0.162
"san-diego.chicago" => 0.162
"seattle.new-york" => 0.225
"san-diego.topeka" => 0.126
"seattle.chicago" => 0.153
I want to transform this into a 2x3 matrix where i is the set san-diego, seattle and j is the set new-york, topeka, chicago. I've tried splitting the keys by using split.(keys(dict),".") but didn't get anywhere.
I want to do this in order to do calculations of the form M[i][j]=0.5 afterwards.
edit: I made a new dictionary where the keys are tuples. I don't know if this helps.
c = Dict("san-diego.new-york" => 0.225, "seattle.topeka" => 0.162, "san-diego.chicago" => 0.162
, "seattle.new-york" => 0.225, "san-diego.topeka" => 0.126, "seattle.chicago" => 0.153)
a = split.(keys(c),".")
b = collect(values(c))
new_c = Dict((a[i][1],a[i][2])=>b[i] for i in 1:length(b))
I ended up writing the following function:
function fillmatrix()
    c = Dict("san-diego.new-york" => 0.225, "seattle.topeka" => 0.162, "san-diego.chicago" => 0.162,
             "seattle.new-york" => 0.225, "san-diego.topeka" => 0.126, "seattle.chicago" => 0.153)
    # keys(c) and values(c) iterate in the same order, so a[i] pairs with b[i]
    a = split.(keys(c), ".")
    b = collect(values(c))
    new_c = Dict((a[i][1], a[i][2]) => b[i] for i = 1:length(b))
    # collect the row and column labels
    list_i = []
    list_j = []
    for (u, v) in keys(new_c)
        push!(list_i, u)
        push!(list_j, v)
    end
    i = unique(list_i)
    j = unique(list_j)
    A = zeros((length(i), length(j)))
    # fill each cell by looking up the position of its labels
    for ii in i
        for jj in j
            A[findfirst(x -> x == ii, i), findfirst(x -> x == jj, j)] = new_c[(ii, jj)]
        end
    end
    return A
end
But this seems like a long workaround and I would like to generalize it to more dimensions. Any thoughts? Thanks in advance.
Upvotes: 1
Views: 492
Reputation: 69869
I will give a solution from your original dictionary (the other can be adjusted accordingly). You can use the NamedArrays.jl package to solve your problem. Here is a full solution:
using NamedArrays
d = Dict("san-diego.new-york" => 0.225,
         "seattle.topeka" => 0.162,
         "san-diego.chicago" => 0.162,
         "seattle.new-york" => 0.225,
         "san-diego.topeka" => 0.126,
         "seattle.chicago" => 0.153)
s = split.(keys(d), '.')
row = unique(string.(getindex.(s, 1)))
col = unique(string.(getindex.(s, 2)))
m = NamedArray([d[r*"."*c] for r in row, c in col],
               (row, col), ("from", "to"))
(This assumes that all row-column pairs are present. Otherwise, write get(d, r*"."*c, missing) instead of d[r*"."*c], and the entries that are not present in your dictionary will be missing.)
And now you can write:
julia> m
2×3 Named Array{Float64,2}
from ╲ to │ new-york    topeka   chicago
──────────┼─────────────────────────────
san-diego │    0.225     0.126     0.162
seattle   │    0.225     0.162     0.153
julia> m["san-diego", "new-york"]
0.225
julia> m[2,3]
0.153
(essentially you can use names or integer indices to reference columns/rows)
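To illustrate the get(d, r*"."*c, missing) variant for incomplete dictionaries, here is a minimal sketch. It assumes a dictionary d_partial (a name made up for this example) with the "seattle.chicago" entry deliberately left out, and it builds a plain Matrix rather than a NamedArray so that it runs without any package:

```julia
# Same data as the original dictionary, minus "seattle.chicago"
# (dropped on purpose, for illustration only).
d_partial = Dict("san-diego.new-york" => 0.225,
                 "seattle.topeka" => 0.162,
                 "san-diego.chicago" => 0.162,
                 "seattle.new-york" => 0.225,
                 "san-diego.topeka" => 0.126)

s_partial = split.(keys(d_partial), '.')
row_partial = unique(string.(getindex.(s_partial, 1)))
col_partial = unique(string.(getindex.(s_partial, 2)))

# get returns its third argument when the key is absent,
# so the comprehension never throws a KeyError.
A = [get(d_partial, r * "." * c, missing) for r in row_partial, c in col_partial]

# Look up positions by label instead of relying on Dict iteration order.
ri = findfirst(==("seattle"), row_partial)
ci = findfirst(==("chicago"), col_partial)
ismissing(A[ri, ci])  # true
```

The element type of A becomes Union{Missing, Float64}, so later calculations probably need skipmissing or coalesce to deal with the gaps.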
Also note that I convert row and col entries to String, but we could also leave them as SubStrings (i.e. omit the string. part in the call); String just looks a bit nicer when printed as NamedArray row/column names.
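For the generalization to more dimensions that the question asks about, one package-free option is to build one label-to-position lookup per dimension. The sketch below is only one way to do it: dict_to_array is a hypothetical helper (it is not part of Base or NamedArrays.jl), and it assumes every key has the same number of '.'-separated labels. Labels are sorted so the layout does not depend on Dict iteration order, and combinations that are absent from the dictionary stay missing.

```julia
# Hypothetical helper (not from Base or NamedArrays.jl): builds a dense
# N-dimensional array from a Dict whose keys are `sep`-separated labels.
function dict_to_array(d::AbstractDict{<:AbstractString}, sep = '.')
    parts = [String.(split(k, sep)) for k in keys(d)]
    n = length(first(parts))
    @assert all(p -> length(p) == n, parts) "all keys need the same number of labels"
    # Unique labels along each dimension, sorted for a reproducible layout.
    labels = [sort(unique(p[dim] for p in parts)) for dim in 1:n]
    # One label => position lookup per dimension (avoids repeated findfirst).
    index = [Dict(l => i for (i, l) in enumerate(ls)) for ls in labels]
    # Start with everything missing; cells not present in d stay that way.
    A = Array{Union{Missing, valtype(d)}}(missing, length.(labels)...)
    for (k, v) in d
        p = String.(split(k, sep))
        A[(index[dim][p[dim]] for dim in 1:n)...] = v
    end
    return A, labels
end

c = Dict("san-diego.new-york" => 0.225, "seattle.topeka" => 0.162,
         "san-diego.chicago" => 0.162, "seattle.new-york" => 0.225,
         "san-diego.topeka" => 0.126, "seattle.chicago" => 0.153)

A, (rows, cols) = dict_to_array(c)
A[findfirst(==("seattle"), rows), findfirst(==("topeka"), cols)]  # 0.162
```

With keys such as "a.b.c" the same call returns a three-dimensional array and three label vectors. If name-based indexing is wanted, the returned array and labels could be passed on to NamedArray in the same way as row and col above.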
Upvotes: 3