Reputation: 61
I'm dealing with a multi-level dictionary in Julia. The outermost dictionary is Dict{String, Any}, which looks like this:
`Dict{String, Any} with 38 entries:
"2024-11-15:169" => Dict{String, Any}("395.0"=>Any[Dict{String, Any}("mini"=>…
"2024-06-28:29" => Dict{String, Any}("418.0"=>Any[Dict{String, Any}("mini"=>…
"2025-06-20:386" => Dict{String, Any}("750.0"=>Any[Dict{String, Any}("mini"=>…`
Each value is also a dictionary Dict{String, Any} with:
`Dict{String, Any} with 112 entries:
"475.0" => Any[Dict{String, Any}("mini"=>false, "settlementType"=>"P", "low52…
"500.0" => Any[Dict{String, Any}("mini"=>false, "settlementType"=>"P", "low52…
"456.0" => Any[Dict{String, Any}("mini"=>false, "settlementType"=>"P", "low52…`
Ultimately, I want to create a dictionary whose values are data frames, i.e. assign to each key of the outermost dictionary a data frame built from the corresponding inner dictionary (with columns like "strike"=[475, 500, 456] or "mini"=[false, false, false], for example).
What does the Any[...] mean in the inner dictionary? How can I get rid of it because I only have dictionary types inside? Then, how do I efficiently collect the info from each dictionary into dataframe columns?
Thanks.
I tried just parsing it with DataFrame(outerdict) but it creates some kind of weird multi-level column index, again with the Any's. Also, with dictionaries as dataframe values it feels quite inefficient to manually create new columns every time from the dictionary entries.
Upvotes: 1
Views: 64
Reputation: 42214
It is not clear from your question what you need.
So let's assume we have a nested dictionary:
julia> using DataFrames
julia> d = Dict(:df1=>[Dict(:a=>10,:b=>20),Dict(:a=>11,:b=>21)], :df2=>[Dict(:a=>10,:b=>20),Dict(:a=>11,:b=>21)])
Dict{Symbol, Vector{Dict{Symbol, Int64}}} with 2 entries:
:df2 => [Dict(:a=>10, :b=>20), Dict(:a=>11, :b=>21)]
:df1 => [Dict(:a=>10, :b=>20), Dict(:a=>11, :b=>21)]
We can convert it to a dictionary of data frames:
julia> dfs = Dict(keys(d) .=> DataFrame.(values(d)))
Dict{Symbol, DataFrame} with 2 entries:
:df2 => 2×2 DataFrame…
:df1 => 2×2 DataFrame…
where a single data frame looks like this:
julia> dfs[:df1]
2×2 DataFrame
 Row │ a      b
     │ Int64  Int64
─────┼──────────────
   1 │    10     20
   2 │    11     21
The trick is the correct usage of broadcasting.
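For the three-level shape shown in the question (outer key => strike key => vector of dictionaries), one extra flattening step is needed before the constructor above applies. The following is only a sketch under assumptions: the sample data is a made-up miniature of the question's structure, `strikes_to_df` is a hypothetical helper name, the keys are converted to `Symbol` to match the example above, and it assumes every contract dictionary has the same keys.

```julia
using DataFrames

# Hypothetical miniature of the structure in the question:
# outer key => (strike => Any[contract dictionaries])
outer = Dict{String, Any}(
    "2024-11-15:169" => Dict{String, Any}(
        "395.0" => Any[Dict{String, Any}("mini" => false, "settlementType" => "P")],
        "400.0" => Any[Dict{String, Any}("mini" => false, "settlementType" => "P")],
    ),
)

# Hypothetical helper: flatten one inner dictionary into a vector of rows,
# storing the strike key as an extra column, then build a data frame.
function strikes_to_df(inner)
    rows = Dict{Symbol, Any}[]
    for (strike, contracts) in inner, c in contracts
        row = Dict{Symbol, Any}(Symbol(k) => v for (k, v) in c)
        row[:strike] = parse(Float64, strike)
        push!(rows, row)
    end
    return DataFrame(rows)
end

# One data frame per outer key.
dfs = Dict(k => strikes_to_df(v) for (k, v) in outer)
```

Each resulting data frame has one row per contract dictionary; the row order follows the iteration order of the inner dictionary, which is not sorted by strike.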
The Any[] in the inner dictionary means a Vector of elements of type Any. Any is an abstract type, a supertype for any Julia type. In my example having d = Dict(:df1=>Any[Dict(:a=>10,:b=>20),Dict(:a=>11,:b=>21)], :df2=>Any[Dict(:a=>10,:b=>20),Dict(:a=>11,:b=>21)]) would not have changed anything (except for performance, which would be lower, as the usage of abstract containers is not recommended in Julia).
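If you want to get rid of the Any element type, the vector can be narrowed to a concrete element type. A small sketch using only Base Julia; the variable names and sample values are made up for illustration:

```julia
# A vector whose declared element type is Any, although every element is a Dict{String, Int}.
v = Any[Dict("a" => 1), Dict("a" => 2)]

# Broadcasting identity re-infers the element type from the actual elements.
narrowed = identity.(v)

# Alternatively, state the target element type explicitly.
explicit = convert(Vector{Dict{String, Int}}, v)
```

Note that when the values inside the dictionaries have mixed types (as in Dict{String, Any}), the narrowed vector will still hold Dict{String, Any} elements; only the outer container becomes concrete.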
Upvotes: 0