ccl
ccl

Reputation: 2378

Using Flux with structs: unexpected channel dimension

I seem to be getting different behaviour (different output dimensions) when calling Flux functions in a struct, versus directly applying the functions to a tensor:

Direct application:

m = Chain(MaxPool((2,2), stride=2),Conv((3,3), 32*8=>32*16, pad=1), BatchNorm(32*16, relu),Conv((3,3), 32*16=>32*16, pad=1), BatchNorm(32*16, relu))
println(size(m(ones((32, 32, 256, 1))))) #gives the expected (16, 16, 512, 1)

Via a struct:

block(in_channels, features) = Chain(MaxPool((2,2), stride=2), Conv((3,3), in_channels=>features, pad=1), BatchNorm(features, relu), Conv((3,3), features=>features, pad=1), BatchNorm(features, relu))

struct test
    b
end

function test()
    b = (block(32*8, 32*16))
    test(b)
end

function (t::test)(x)
    x1 = t.b[1](x)
    println(size(x1)) 
end

test1 = test()
test1(ones((32, 32, 256, 1))) #gives (16, 16, 256, 1)

Why is the output channel dimensions different for the 2 snippets? What am I missing about structs in Julia? Thanks!

Upvotes: 1

Views: 121

Answers (2)

Avik Pal
Avik Pal

Reputation: 36

The correct way to define the (t::test)(x) function would be

function (t::test)(x)
    x1 = t.b(x)  # Note the absence of [1]
    println(size(x1)) 
end

t.b[1] would give the first layer in Chain, i.e., MaxPool((2, 2), pad = (0, 0, 0, 0), stride = (2, 2)) and as such your input is never passed through the Conv Layers.

Upvotes: 2

ccl
ccl

Reputation: 2378

Found the mistake, it's related to indexing instead of use of structs. I declared b as b = (block(32*8, 32*16)), but by indexing b[1], I am actually only calling the first operation (MaxPool) in the Flux chain, which accounts for the difference in channel dimension. What I should do instead is b = block(32*8, 32*16) and x1 = t.b(x) to run all functions in the chain.

Upvotes: 0

Related Questions