Luca

Reputation: 10996

Dropping singleton dimensions in Julia

I'm just playing around with Julia (1.0), and one thing I use a lot in Python/NumPy/MATLAB is the squeeze function, which drops singleton dimensions.

I found out that one way to do this in Julia is:

a = rand(3, 3, 1);
a = dropdims(a, dims = tuple(findall(size(a) .== 1)...))

The second line seems cumbersome and hard to read at a glance (though this could be a bias I bring from other languages). Is this the canonical way to do it in Julia?

Upvotes: 14

Views: 9696

Answers (4)

tholy

Reputation: 12179

Let me simply add that "uncontrolled" dropdims (dropping any singleton dimension) is a frequent source of bugs. For example, suppose you have some loop that asks for a data array A from some external source, and you run R = sum(A, dims=2) on it and then get rid of all singleton dimensions. But then suppose that one time out of 10000, your external source returns an A for which size(A, 1) happens to be 1: boom, suddenly you're dropping more dimensions than you intended and are perhaps at risk of grossly misinterpreting your data.

If you specify those dimensions manually instead (e.g., dropdims(R, dims=2)) then you are immune from bugs like these.
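A minimal sketch of the failure mode described above. The helper name drop_all_singletons is hypothetical (it just wraps the expression from the question), and the array sizes are made up for illustration.

```julia
# Hypothetical helper: drops every size-1 dimension (the expression from the question).
drop_all_singletons(x) = dropdims(x, dims = tuple(findall(size(x) .== 1)...))

# The usual case: several rows.
A = rand(4, 5)
R = sum(A, dims = 2)                        # size (4, 1)
usual = drop_all_singletons(R)              # size (4,): a vector, as intended

# The rare case: the source returns a single row, so size(B, 1) == 1.
B = rand(1, 5)
S = sum(B, dims = 2)                        # size (1, 1)
rare_uncontrolled = drop_all_singletons(S)  # both dimensions dropped: 0-dimensional
rare_explicit = dropdims(S, dims = 2)       # only dimension 2 dropped: size (1,)
```

With the explicit dims=2, the result stays a vector in both cases, so downstream code that expects one dimension keeps working.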

Upvotes: 10

Tasos Papastylianou

Reputation: 22245

I'm a bit surprised at Colin's revelation; surely something relying on 'reshape' is type stable? (plus, as a bonus, returns a view rather than a copy).

julia> function squeeze( A :: AbstractArray )
         keepdims = Tuple(i for i in size(A) if i != 1);
         return reshape( A, keepdims );
       end;

julia> a = randn(2,1,3,1,4,1,5,1,6,1,7);

julia> size( squeeze(a) )
(2, 3, 4, 5, 6, 7)

No?
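One way to probe both claims, offered as a sketch rather than a verdict: the function is repeated from above, the test arrays are arbitrary, and Base.return_types is used to ask the compiler what it infers for a 3-dimensional Float64 input.

```julia
function squeeze(A::AbstractArray)
    keepdims = Tuple(i for i in size(A) if i != 1)
    return reshape(A, keepdims)
end

# Does the result share memory with the input?
a = zeros(2, 1, 3)
b = squeeze(a)
b[1, 1] = 42.0                          # write through the reshaped array
shares_memory = (a[1, 1, 1] == 42.0)

# Same input type (Array{Float64,3}), different output dimensionality.
nd_a = ndims(squeeze(zeros(2, 1, 3)))
nd_b = ndims(squeeze(zeros(2, 1, 1)))

# What return type does inference report for this input type?
inferred = first(Base.return_types(squeeze, (Array{Float64,3},)))
is_stable = isconcretetype(inferred)
```

For a plain Array, the reshaped result shares memory with the original, which is the sense in which it behaves like a view. The number of kept dimensions still depends on the runtime sizes, so the inferred return type cannot be a single concrete array type.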

Upvotes: 0

Colin T Bowers

Reputation: 18560

The actual answer to this question surprised me. What you are asking could be rephrased as:

why doesn't dropdims(a) remove all singleton dimensions?

I'm going to quote Tim Holy from the relevant issue here:

it's not possible to have squeeze(A) return a type that the compiler can infer---the sizes of the input matrix are a runtime variable, so there's no way for the compiler to know how many dimensions the output will have. So it can't possibly give you the type stability you seek.
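To make the quoted point concrete, here is a small sketch. The name drop_singletons is hypothetical and stands for the expression in the question. All three inputs have the same type, Array{Float64,3}, yet the outputs have different types.

```julia
# Hypothetical name for the expression from the question.
drop_singletons(a) = dropdims(a, dims = tuple(findall(size(a) .== 1)...))

x = drop_singletons(rand(3, 3, 1))   # one singleton dimension removed
y = drop_singletons(rand(3, 1, 1))   # two removed
z = drop_singletons(rand(1, 1, 1))   # all three removed

# With dims given explicitly, the number of output dimensions is fixed
# by the call itself rather than by the values in size(a).
w = dropdims(rand(3, 3, 1), dims = 3)
```

Because the output type depends on the values in size(a) rather than on the input type alone, the compiler cannot settle on one return type for drop_singletons.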

Type stability aside, there are also some other surprising implications of what you have written. For example, note that:

julia> f(a) = dropdims(a, dims = tuple(findall(size(a) .== 1)...))
f (generic function with 1 method)

julia> f(rand(1,1,1))
0-dimensional Array{Float64,0}:
0.9939103383167442

In summary, including such a method in Base Julia would encourage users to use it, resulting in potentially type-unstable code that, under some circumstances, will not be fast (something the core developers are strenuously trying to avoid). In languages like Python, rigorous type-stability is not enforced, and so you will find such functions.

Of course, nothing stops you from defining your own method as you have, and I don't think you'll find a significantly simpler way of writing it. For example, the method proposed for Base (but never implemented) is shown below. It predates Julia 1.0, so it uses the old name squeeze for what is now dropdims:

function squeeze(A::AbstractArray)
    singleton_dims = tuple((d for d in 1:ndims(A) if size(A, d) == 1)...)
    return squeeze(A, singleton_dims)
end

Just be aware of the potential implications of using it.

Upvotes: 20

Przemyslaw Szufel

Reputation: 42234

You can drop the call to tuple and use a trailing comma instead:

dropdims(a, dims = (findall(size(a) .== 1)...,))
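As a quick illustration on an arbitrary example array, the two spellings build the same tuple. The Tuple(v) form is not part of the original answer and is included here only for comparison.

```julia
a = rand(3, 1, 4, 1)
v = findall(size(a) .== 1)   # indices of the singleton dimensions

d1 = tuple(v...)             # form used in the question
d2 = (v...,)                 # trailing-comma form
d3 = Tuple(v)                # constructor form, added for comparison

r = dropdims(a, dims = d2)   # drops dimensions 2 and 4
```

All three produce the same dims argument, so the choice between them is a matter of readability.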

Upvotes: 6
