José Manuel

Reputation: 101

How to create a custom iterator in Julia 1.0?

I have this structure in Julia 1.0:

mutable struct Metadata
    id::Int64
    res_id::Int64
end

I create an array of these in which id is always incremented by one, but res_id is only sometimes incremented, like so:

data = [
    Metadata(1, 1),
    Metadata(2, 1),
    Metadata(3, 1),
    Metadata(4, 2),
    Metadata(5, 2),
    Metadata(6, 2),
...]

What I want to do is be able to iterate over this Array, but get blocks based on the res_id (all the data with res_id 1, then 2, etc). The desired behavior would be something like this:

for res in iter_res(data)
    println(res)
end

julia>
[Metadata(1, 1), Metadata(2, 1), Metadata(3, 1)]
[Metadata(4, 2), Metadata(5, 2), Metadata(6, 2)]

How do I do this in Julia 1.0, considering that I also need to normally iterate over the array to get element by element?

Upvotes: 4

Views: 2986

Answers (5)

BoZenKhaa

Reputation: 941

In Julia 1.0 and later, this is done by implementing Base.iterate(::YourType), which produces the first iteration, and Base.iterate(::YourType, state), which produces each later iteration based on the given state. Both methods should return nothing when iteration is done, and an (item, state) tuple otherwise.

Iterating on YourType with

for i in x
    # stuff
end

is then a shorthand for writing

it = iterate(x)
while it !== nothing
    i, state = it
    # stuff
    it = iterate(x, state)
end

See the manual for details.
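
Applied to the question, the protocol above could look like the following minimal sketch. The wrapper type ResBlocks and the constructor iter_res are hypothetical names chosen for illustration (iter_res only mirrors the name used in the question), and the sketch assumes that elements with the same res_id are adjacent in the array. The state is simply the index at which the next block starts.

```julia
mutable struct Metadata
    id::Int64
    res_id::Int64
end

# Hypothetical wrapper: iterating it yields runs of adjacent elements
# that share the same res_id.
struct ResBlocks
    data::Vector{Metadata}
end

# Hypothetical convenience function matching the name used in the question.
iter_res(data::Vector{Metadata}) = ResBlocks(data)

# The default argument defines both iterate(it) and iterate(it, state).
# The state is the index of the first element of the next block.
function Base.iterate(it::ResBlocks, start::Int = 1)
    start > length(it.data) && return nothing
    res_id = it.data[start].res_id
    stop = start
    # Extend the block while the next element has the same res_id.
    while stop < length(it.data) && it.data[stop + 1].res_id == res_id
        stop += 1
    end
    return (it.data[start:stop], stop + 1)
end

# The number of blocks is not known without scanning the data.
Base.IteratorSize(::Type{ResBlocks}) = Base.SizeUnknown()
Base.eltype(::Type{ResBlocks}) = Vector{Metadata}

data = [Metadata(1, 1), Metadata(2, 1), Metadata(3, 1),
        Metadata(4, 2), Metadata(5, 2), Metadata(6, 2)]

for res in iter_res(data)
    println([(m.id, m.res_id) for m in res])
end
```

Because the wrapper is a separate type, a plain for m in data loop still visits the array element by element.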

Upvotes: 3

José Manuel

Reputation: 101

How I eventually handled the problem:

function iter(data::Vector{Metadata}; property::Symbol = :res_id)
    # Group consecutive elements that share the same value of `property`.
    f = Vector{Vector{Metadata}}()
    isempty(data) && return f  # nothing to group

    # Value of `property` for the group currently being filled.
    cmp = getproperty(first(data), property)
    push!(f, Metadata[])
    for s in data
        val = getproperty(s, property)
        if val != cmp
            # The value changed, so start a new group.
            push!(f, Metadata[])
            cmp = val
        end
        push!(last(f), s)
    end
    return f
end

This lets me accommodate "skipped" res_ids (jumping from 1 to 3, for example) and also group the Metadata objects by other properties I may add later, including ones that are Strings or other types rather than Int64. It assumes that elements with the same value are adjacent in the array. It works, although it probably isn't very efficient.

You can then iterate over the Vector{Metadata} this way:

for res in iter(data)
    println(res)
end

Upvotes: 0

Przemyslaw Szufel

Reputation: 42214

From the names of your variables, it seems you are collecting data from some computational process. A DataFrame is the usual tool for that purpose.

using DataFrames
data = DataFrame(id=[1,2,3,4,5,6],res_id=[1,1,1,2,2,2])
for group in groupby(data,:res_id)
    println(group)
end

This yields:

3×2 SubDataFrame{Array{Int64,1}}
│ Row │ id    │ res_id │
│     │ Int64 │ Int64  │
├─────┼───────┼────────┤
│ 1   │ 1     │ 1      │
│ 2   │ 2     │ 1      │
│ 3   │ 3     │ 1      │
3×2 SubDataFrame{Array{Int64,1}}
│ Row │ id    │ res_id │
│     │ Int64 │ Int64  │
├─────┼───────┼────────┤
│ 1   │ 4     │ 2      │
│ 2   │ 5     │ 2      │
│ 3   │ 6     │ 2      │

This is also more convenient for further processing of results.

Upvotes: 0

Kai Liu

Reputation: 1

Sounds like you need a groupBy function. Here is an implementation for reference, in Haskell:

groupBy                 :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy _  []           =  []
groupBy eq (x:xs)       =  (x:ys) : groupBy eq zs
                           where (ys,zs) = span (eq x) xs
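
For comparison, the same idea can be sketched in Julia. The function name group_by is hypothetical and is not part of Base. As in the Haskell version, each element is compared against the first element of the current group, so only adjacent elements end up grouped together.

```julia
# Hypothetical helper mirroring Haskell's groupBy: an element joins the
# current group when eq(first_of_group, element) is true; otherwise it
# starts a new group.
function group_by(eq, xs::AbstractVector{T}) where {T}
    groups = Vector{Vector{T}}()
    for x in xs
        if !isempty(groups) && eq(first(last(groups)), x)
            push!(last(groups), x)
        else
            push!(groups, T[x])
        end
    end
    return groups
end

# Group adjacent equal integers.
runs = group_by(==, [1, 1, 2, 3, 3])
println(runs)
```

For the Metadata array from the question, the predicate would be (a, b) -> a.res_id == b.res_id.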

Upvotes: -2

fredrikekre

Reputation: 10984

You can iterate over a Generator of filters like this:

julia> mutable struct Metadata
           id::Int64
           res_id::Int64
       end

julia> data = [
           Metadata(1, 1),
           Metadata(2, 1),
           Metadata(3, 1),
           Metadata(4, 2),
           Metadata(5, 2),
           Metadata(6, 2),
       ];

julia> for res in (filter(x -> x.res_id == i, data) for i in 1:2)
           println(res)
       end
Metadata[Metadata(1, 1), Metadata(2, 1), Metadata(3, 1)]
Metadata[Metadata(4, 2), Metadata(5, 2), Metadata(6, 2)]
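
The range 1:2 above is hardcoded. As a possible variation, the ids can be derived from the data with unique, which also copes with skipped ids. This is only a sketch: it runs one filter pass over data per distinct res_id, and unlike a run-based approach it also collects elements with the same res_id that are not adjacent.

```julia
mutable struct Metadata
    id::Int64
    res_id::Int64
end

# Sample data with a skipped res_id (1, then 3).
data = [Metadata(1, 1), Metadata(2, 1), Metadata(3, 1),
        Metadata(4, 3), Metadata(5, 3), Metadata(6, 3)]

# Distinct res_id values, in order of first appearance.
ids = unique(m.res_id for m in data)

# Lazy generator: each filter call runs only when the block is requested.
blocks = (filter(m -> m.res_id == i, data) for i in ids)

for res in blocks
    println([(m.id, m.res_id) for m in res])
end
```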

Upvotes: 0
