Reputation: 18560
I want to store high-frequency financial data in memory while I work with it in Julia.
My data is in lots of arrays of Float64. Each array stores high frequency data from a single day, for some security, on some market. For example, for the date 2010-01-04, for IBM, listed on the NYSE (New York Stock Exchange), there is one array of Float64.
As stated, I have many such arrays, spanning multiple dates, markets, and securities. I want to store them all in one object, such that it is easy to retrieve any given array (probably exploiting the tree-like structure of metadata).
In Matlab, I used to store this in a structure, where the first level is the market, the next level is the security, the next level is the date, and at the end of the tree is the corresponding array. At each level I also stored a list of the fields at that level.
Julia doesn't really have an equivalent to Matlab structures, so what is the best way to do this in Julia?
Currently, the best I can come up with is a sequence of nested composite types, each with two fields. For example:
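using Dates  # provides the Date type

mutable struct HighFrequencyData
    dateList::Array{Date, 1}   # one date per stored day
    dataArray::Array{Any, 1}   # dataArray[i] holds the Float64 array for dateList[i]
end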
where dateList stores a list of dates that correspond to a sequence of arrays of Float64 held in dataArray (i.e. dateList and dataArray will have the same length). Then:
mutable struct securitiesData
    securityList::Array{String, 1}      # security names
    highFrequencyArray::Array{Any, 1}   # highFrequencyArray[i] is the HighFrequencyData for securityList[i]
end
where securityList stores a list of securities that correspond to a sequence of type HighFrequencyData held in highFrequencyArray. Then:
mutable struct marketsData
    marketList::Array{String, 1}     # market names
    securitiesArray::Array{Any, 1}   # securitiesArray[i] is the securitiesData for marketList[i]
end
where marketList stores a list of markets that correspond to a sequence of type securitiesData held in securitiesArray.
Given this, all data can now be stored in a variable of type marketsData, and looked up using marketList, securityList, and dateList, at each level of nesting.
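To make the lookup concrete, here is a minimal sketch of walking the three levels. It repeats the three types so that it runs on its own, written as immutable structs with concrete element types in place of Array{Any, 1} (an assumption made only for this sketch). The `lookup` helper and the sample values are hypothetical.

```julia
using Dates

# Same layout as the three types above; concrete element types are an assumption.
struct HighFrequencyData
    dateList::Vector{Date}
    dataArray::Vector{Vector{Float64}}
end

struct securitiesData
    securityList::Vector{String}
    highFrequencyArray::Vector{HighFrequencyData}
end

struct marketsData
    marketList::Vector{String}
    securitiesArray::Vector{securitiesData}
end

# Hypothetical helper: a linear search through the list at each level of nesting.
function lookup(md::marketsData, market::String, security::String, date::Date)
    i = findfirst(==(market), md.marketList)
    i === nothing && throw(KeyError(market))
    sd = md.securitiesArray[i]
    j = findfirst(==(security), sd.securityList)
    j === nothing && throw(KeyError(security))
    hf = sd.highFrequencyArray[j]
    k = findfirst(==(date), hf.dateList)
    k === nothing && throw(KeyError(date))
    return hf.dataArray[k]
end

# Made-up sample values, for illustration only.
ibm = HighFrequencyData([Date(2010, 1, 4)], [[1.0, 2.0, 3.0]])
nyse = securitiesData(["IBM"], [ibm])
allData = marketsData(["NYSE"], [nyse])

x = lookup(allData, "NYSE", "IBM", Date(2010, 1, 4))
```

Each retrieval costs three linear searches, and every list has to be kept in sync with its companion array by hand.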
But this feels a bit cumbersome...
Upvotes: 3
Views: 2268
Reputation: 11664
Your type hierarchy looks ok, but maybe dictionaries are all you need?
all_data = Dict(
    "Market1" => Dict(
        "Sec1" => ([20140827, 20140825], [1.05, 10.6]),
        "Sec2" => ([20140827, 20140825], [1.05, 10.6])),
    "Market2" => Dict(
        "Sec1" => ([20140827, 20140825], [1.05, 10.6]),
        "Sec2" => ([20140827, 20140825], [1.05, 10.6])),
    # ... more markets
)
# Each security maps to a (dates, values) tuple; index 2 selects the values
println(all_data["Market1"]["Sec1"][2] ./ all_data["Market2"]["Sec1"][2])
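If the dictionary route fits, a typed variant might look like the sketch below. It assumes the market, security, date nesting described in the question. The alias names, the `store!` helper, and the numeric values are all hypothetical.

```julia
using Dates

# market => security => date => intraday observations
const DayMap = Dict{Date, Vector{Float64}}
const SecurityMap = Dict{String, DayMap}
const MarketMap = Dict{String, SecurityMap}

# Hypothetical helper: get! creates the intermediate levels on first use.
function store!(db::MarketMap, market::String, security::String,
                date::Date, x::Vector{Float64})
    securities = get!(() -> SecurityMap(), db, market)
    days = get!(() -> DayMap(), securities, security)
    days[date] = x
    return db
end

db = MarketMap()
store!(db, "NYSE", "IBM", Date(2010, 1, 4), [1.05, 10.6])  # placeholder values
store!(db, "NYSE", "IBM", Date(2010, 1, 5), [1.10, 10.2])  # placeholder values

# Retrieval is three keyed lookups.
x = db["NYSE"]["IBM"][Date(2010, 1, 4)]

# The keys at each level stand in for a separately maintained list of fields.
market_names = collect(keys(db))
dates = sort(collect(keys(db["NYSE"]["IBM"])))
```

Declaring the key and value types keeps the leaf arrays as Vector{Float64} rather than Any, and a missing market, security, or date raises a KeyError instead of needing a manual search.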
If you could post what the MATLAB code looks like, that might be helpful too.
I would reformulate your types a little bit, maybe something simpler like:
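using Dates

mutable struct TimeSeries
    dates::Vector{Date}
    data::Vector{Any}
end
const Security = Tuple{String,TimeSeries}   # (name, series) pair
const Market = Vector{Security}
markets = Market[]
# TimeSeries(...) stands for a constructor call with the dates and data filled in
push!(markets, [("Sec1",TimeSeries(...)), ("Sec2",TimeSeries(...))])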
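As a rough illustration of how data could be pulled back out of this layout, here is a self-contained sketch. The constructor arguments are placeholders, since they are elided above, and the `dayData` helper is hypothetical.

```julia
using Dates

struct TimeSeries
    dates::Vector{Date}
    data::Vector{Any}
end

const Security = Tuple{String, TimeSeries}
const Market = Vector{Security}

# Placeholder contents, for illustration only.
ts = TimeSeries([Date(2014, 8, 25), Date(2014, 8, 27)],
                Any[[1.05, 10.6], [1.07, 10.4]])

markets = Market[]
push!(markets, Security[("Sec1", ts), ("Sec2", ts)])

# Hypothetical helper: find the security by name, then the entry for the date.
# Returns nothing when either the name or the date is absent.
function dayData(m::Market, name::String, date::Date)
    for (secName, series) in m
        secName == name || continue
        k = findfirst(==(date), series.dates)
        k === nothing && break
        return series.data[k]
    end
    return nothing
end

x = dayData(markets[1], "Sec1", Date(2014, 8, 27))
```

Markets are addressed by position here rather than by name, so a Dict keyed by market name may be more convenient if there are many of them.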
Also, make sure to check out https://github.com/JuliaStats/TimeSeries.jl
Upvotes: 5