Reputation: 101
I have this structure in Julia 1.0:
mutable struct Metadata
    id::Int64
    res_id::Int64
end
I create an array of these in which id always increases by one, but res_id only sometimes increases, like so:
data = [
    Metadata(1, 1),
    Metadata(2, 1),
    Metadata(3, 1),
    Metadata(4, 2),
    Metadata(5, 2),
    Metadata(6, 2),
...]
What I want is to iterate over this Array, but get blocks based on res_id (all the data with res_id 1, then 2, etc.). The desired behavior would be something like this:
for res in iter_res(data)
    println(res)
end
julia>
[Metadata(1, 1), Metadata(2, 1), Metadata(3, 1)]
[Metadata(4, 2), Metadata(5, 2), Metadata(6, 2)]
How do I do this in Julia 1.0, considering that I also need to normally iterate over the array to get element by element?
Upvotes: 4
Views: 2986
Reputation: 941
In Julia 1+, this should be done by implementing Base.iterate(::YourType) to get the first iteration and Base.iterate(::YourType, state) for the following iterations, based on some state. These methods should return nothing when done, and a (result, state) tuple otherwise.
Iterating on YourType with
for i in x
    # stuff
end
is then a shorthand for writing
it = iterate(x)
while it !== nothing
    i, state = it
    # stuff
    it = iterate(x, state)
end
See the manual for details.
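As a minimal sketch of how this could apply to the question, the wrapper type below yields runs of consecutive elements that share a res_id. The names ResBlocks and iter_res are hypothetical (iter_res is taken from the question's desired usage), and the sketch assumes that equal res_id values are contiguous in the array:

```julia
mutable struct Metadata
    id::Int64
    res_id::Int64
end

# Hypothetical wrapper: iterating it yields blocks of consecutive
# elements that share the same res_id.
struct ResBlocks
    data::Vector{Metadata}
end

# First call: start at index 1.
Base.iterate(rb::ResBlocks) = iterate(rb, 1)

# Later calls: `state` is the index where the next block starts.
function Base.iterate(rb::ResBlocks, state::Int)
    state > length(rb.data) && return nothing    # signals the end of iteration
    stop = state
    while stop < length(rb.data) && rb.data[stop + 1].res_id == rb.data[state].res_id
        stop += 1
    end
    return (rb.data[state:stop], stop + 1)       # (result, next state)
end

# Optional traits so that collect() works without a length method.
Base.IteratorSize(::Type{ResBlocks}) = Base.SizeUnknown()
Base.eltype(::Type{ResBlocks}) = Vector{Metadata}

iter_res(data) = ResBlocks(data)

data = [Metadata(1, 1), Metadata(2, 1), Metadata(3, 1),
        Metadata(4, 2), Metadata(5, 2), Metadata(6, 2)]

for res in iter_res(data)
    println(res)
end
```

Because the blocks come from a separate wrapper, the plain array can still be iterated element by element with for m in data.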
Upvotes: 3
Reputation: 101
How I eventually handled the problem:
function iter(data::Vector{Metadata}; property::Symbol = :res_id)
    # Collect the unique values of this property, in order of first appearance
    up = Vector{Any}()
    for s in data
        v = getproperty(s, property)
        v in up || push!(up, v)
    end
    # Group elements by those unique values.
    # Assumes data is non-empty and equal values are contiguous.
    f = Vector{Vector{Metadata}}()
    idx::Int64 = 1
    cmp::Any = up[idx]
    push!(f, Vector{Metadata}())
    for s in data
        if getproperty(s, property) == cmp
            push!(f[idx], s)
        else
            # A new value starts a new block
            push!(f, Vector{Metadata}())
            idx += 1
            cmp = up[idx]
            push!(f[idx], s)
        end
    end
    return f
end
This allows me to accommodate "skipped" res_id's (like jumping from 1 to 3, etc.) and even to group the Metadata objects by other fields I may add later, including fields of types other than Int64, such as Strings. It works, although it probably isn't very efficient.
You can then iterate over the blocks of a Vector{Metadata} (here rs) this way:
for r in iter(rs)
    println(r)
end
Upvotes: 0
Reputation: 42214
From the names of your variables, it seems you are collecting the data from some computational process. Normally you would use a DataFrame for that purpose.
using DataFrames
data = DataFrame(id=[1,2,3,4,5,6],res_id=[1,1,1,2,2,2])
for group in groupby(data,:res_id)
    println(group)
end
This yields:
3×2 SubDataFrame{Array{Int64,1}}
│ Row │ id │ res_id │
│ │ Int64 │ Int64 │
├─────┼───────┼────────┤
│ 1 │ 1 │ 1 │
│ 2 │ 2 │ 1 │
│ 3 │ 3 │ 1 │
3×2 SubDataFrame{Array{Int64,1}}
│ Row │ id │ res_id │
│ │ Int64 │ Int64 │
├─────┼───────┼────────┤
│ 1 │ 4 │ 2 │
│ 2 │ 5 │ 2 │
│ 3 │ 6 │ 2 │
This is also more convenient for further processing of results.
Upvotes: 0
Reputation: 1
Sounds like you need a groupBy function. Here is an implementation for reference, in Haskell:
groupBy :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy _ [] = []
groupBy eq (x:xs) = (x:ys) : groupBy eq zs
  where (ys,zs) = span (eq x) xs
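For readers who want the same idea in Julia, here is a rough sketch of that definition. The name group_by is hypothetical, not a Base function. Like the Haskell version, it compares each element against the first element of the current run, so only adjacent elements are grouped:

```julia
# Hypothetical helper mirroring the Haskell groupBy above: an element joins
# the current run when eq(first_of_run, element) is true, otherwise it
# starts a new run.
function group_by(eq, xs::AbstractVector{T}) where {T}
    groups = Vector{Vector{T}}()
    for x in xs
        if !isempty(groups) && eq(first(last(groups)), x)
            push!(last(groups), x)
        else
            push!(groups, T[x])
        end
    end
    return groups
end

mutable struct Metadata
    id::Int64
    res_id::Int64
end

data = [Metadata(1, 1), Metadata(2, 1), Metadata(3, 1),
        Metadata(4, 2), Metadata(5, 2), Metadata(6, 2)]

for res in group_by((a, b) -> a.res_id == b.res_id, data)
    println(res)
end
```

This builds all groups eagerly, so it suits arrays that comfortably fit in memory.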
Upvotes: -2
Reputation: 10984
You can iterate over a generator of filter calls, one per res_id value (1:2 here), like this:
julia> mutable struct Metadata
           id::Int64
           res_id::Int64
       end
julia> data = [
           Metadata(1, 1),
           Metadata(2, 1),
           Metadata(3, 1),
           Metadata(4, 2),
           Metadata(5, 2),
           Metadata(6, 2),
       ];
julia> for res in (filter(x -> x.res_id == i, data) for i in 1:2)
           println(res)
       end
Metadata[Metadata(1, 1), Metadata(2, 1), Metadata(3, 1)]
Metadata[Metadata(4, 2), Metadata(5, 2), Metadata(6, 2)]
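If the res_id values are not known in advance, or skip numbers, one possible variation is to take them from the data with unique instead of hardcoding a range. This is only a sketch. The variable names res_ids and blocks are made up for illustration, and the sample data deliberately skips res_id 2:

```julia
mutable struct Metadata
    id::Int64
    res_id::Int64
end

data = [Metadata(1, 1), Metadata(2, 1), Metadata(3, 3), Metadata(4, 3)]

# The res_id values that actually occur, in order of first appearance.
res_ids = unique([m.res_id for m in data])

# Lazy generator: each block is only built when the loop reaches it.
blocks = (filter(m -> m.res_id == i, data) for i in res_ids)

for res in blocks
    println(res)
end
```

Each filter call scans the whole array, so the total cost grows with the number of distinct res_id values times the array length.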
Upvotes: 0