This seems basic enough that I expected it to have been asked already, but I couldn't find it.
When I use broadcasting the naive way, I get an array of arrays when I would like to get a two-dimensional array. For example, take this function:
function onehotencode(n, domain_size)
    return [ n == k ? 1 : 0 for k in 1:domain_size ]
end
When I run
onehotencode.([1,2,3,4], 10)
I get
4-element Array{Array{Int64,1},1}:
[1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
Instead, I would like to get
4x10 Array{Int64,2}:
1 0 0 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0
Upvotes: 6
Views: 517
Reputation: 69869
Your function returns a vector for each input, so broadcasting collects them into a vector of vectors. Either write:
permutedims(reduce(hcat, onehotencode.([1,2,3,4], 10)))
which reuses your code and gets you what you want (but is not very efficient), or simply write:
.==([1,2,3,4], (1:10)')
or
.==([1,2,3,4], hcat(1:10...))
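To see why the first variant needs permutedims, here is a small sketch of the intermediate shapes. It repeats the question's onehotencode so that it runs on its own; the variable names vecs, cols and rows are only illustrative.

```julia
# Same definition as in the question, repeated so the snippet is self-contained.
onehotencode(n, domain_size) = [n == k ? 1 : 0 for k in 1:domain_size]

vecs = onehotencode.([1, 2, 3, 4], 10)  # 4-element vector of 10-element vectors
cols = reduce(hcat, vecs)               # each vector becomes a column: 10×4
rows = permutedims(cols)                # swap the two dimensions: 4×10

size(cols)  # (10, 4)
size(rows)  # (4, 10)
```

The extra pass over the data in reduce(hcat, ...) followed by permutedims is the reason this route is less efficient than broadcasting against a row directly.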
If you want to get an Int (not Bool) then write Int.(.==([1,2,3,4], hcat(1:10...))).
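The same comparison can also be written with the infix dot syntax, which some readers find easier to follow. This is only a sketch; the names ns, ks, mask and onehot are made up for illustration.

```julia
ns = [1, 2, 3, 4]      # column of inputs, one matrix row per input
ks = (1:10)'           # 1×10 row of candidate values, one matrix column each

mask = ns .== ks       # 4×10 matrix of Bool
onehot = Int.(mask)    # same shape, converted to Int

size(onehot)           # (4, 10)
```

A column broadcast against a row expands along both dimensions, which is what produces the two-dimensional result.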
== can be replaced by any function of your choice that works on scalars, for example:
julia> f(x,y) = (x,y)
f (generic function with 1 method)
julia> f.([1,2,3,4], hcat(1:10...))
4×10 Array{Tuple{Int64,Int64},2}:
(1, 1) (1, 2) (1, 3) (1, 4) (1, 5) (1, 6) (1, 7) (1, 8) (1, 9) (1, 10)
(2, 1) (2, 2) (2, 3) (2, 4) (2, 5) (2, 6) (2, 7) (2, 8) (2, 9) (2, 10)
(3, 1) (3, 2) (3, 3) (3, 4) (3, 5) (3, 6) (3, 7) (3, 8) (3, 9) (3, 10)
(4, 1) (4, 2) (4, 3) (4, 4) (4, 5) (4, 6) (4, 7) (4, 8) (4, 9) (4, 10)
A general rule that I find useful in practice in Julia is to write functions that work on scalars and then use broadcasting or other higher-order components of the language to apply them to collections.
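As a rough illustration of that rule, the sketch below defines one scalar function and applies it in three ways. The function absdiff and the sample data are hypothetical and not part of the question.

```julia
# A function written purely for scalars.
absdiff(x, y) = abs(x - y)

xs = [1, 2, 3]
ys = [10, 20]

a = absdiff.(xs, ys')                      # broadcasting column against row: 3×2
b = [absdiff(x, y) for x in xs, y in ys]   # two-dimensional comprehension: 3×2
c = map(absdiff, xs, [10, 20, 30])         # elementwise over two equal-length vectors

a == b  # true
```

Because absdiff knows nothing about arrays, the caller decides the output shape by choosing how to apply it.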
EDIT
Your function takes scalars, but actually expands them internally and returns a Vector. So conceptually your function is something like:
function onehotencode(n, domain_range)
    return [ n == k ? 1 : 0 for k in domain_range]
end
although it is hidden because you pass a scalar. Therefore you are allowed to write onehotencode.([1,2,3,4], hcat(1:10...)) with your onehotencode implementation, but each return value is treated as a single entry in a cell of the resulting Matrix (and this is clearly not what you want).
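A short sketch of what that looks like in practice. The name onehotvec is a stand-in for the question's vector-returning onehotencode, chosen here only to avoid clashing with the scalar version.

```julia
# Vector-returning version, as in the question.
onehotvec(n, domain_size) = [n == k ? 1 : 0 for k in 1:domain_size]

result = onehotvec.([1, 2, 3, 4], (1:10)')

size(result)    # (4, 10)
result[2, 3]    # [0, 1, 0], i.e. onehotvec(2, 3)
```

The outer shape is already 4×10, but every cell holds a whole vector whose length depends on the column, so the result is a matrix of vectors, not a matrix of integers.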
If you define your function as:
function onehotencode(n, v)
    return n == v ? 1 : 0
end
i.e. taking scalars and returning a scalar (or more precisely returning a "single entry" in the expected resulting Matrix, as technically it does not have to be a scalar) then all works as expected:
julia> onehotencode.([1,2,3,4], hcat(1:10...))
4×10 Array{Int64,2}:
1 0 0 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0
So, in summary, the function should take scalars as arguments and return a scalar. Again, the word scalar is a simplification: both the arguments and the return value can be anything that is treated as a single entry; scalars are simply the most common case.
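One way to make a non-scalar argument count as a single entry is to wrap it in Ref, which broadcasting treats as a scalar. The sketch below is only an illustration; the names allowed and hits are hypothetical.

```julia
allowed = [2, 3]

# Ref(allowed) is passed whole to every call of `in`,
# so each element on the left is tested against the full vector.
hits = in.([1, 2, 3, 4], Ref(allowed))

hits == [false, true, true, false]  # true
```

Without the Ref wrapper, broadcasting would try to pair the 4-element and 2-element vectors elementwise and fail with a dimension mismatch.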
Upvotes: 4