Reputation: 404
Given a function/expression that yields a single column, how to build a matrix from those columns in Julia? I tried the following (simplified example):
column(j) = [j, j, j] # for example
my_matrix = Float64[column(j)[i] for i in 1:3, j in 1:4]
==> 3x4 Array{Float64,2}:
1.0 2.0 3.0 4.0
1.0 2.0 3.0 4.0
1.0 2.0 3.0 4.0
The result is correct, but this is not what I want because the column-vector-expression is evaluated unnecessarily (once for every row).
I also tried this alternative approach:
[column(j) for j in 1:4]
==> 4-element Array{Array{Float64,1},1}:
[1.0,1.0,1.0]
[2.0,2.0,2.0]
[3.0,3.0,3.0]
[4.0,4.0,4.0]
But I found no way to convert or reshape this into the form I want (matrix with two dimensions instead of array of arrays).
How to achieve this in Julia?
Upvotes: 5
Views: 11062
Reputation: 39717
Since Julia 1.9.0 stack
can be used to combine a collection of arrays (or other iterable objects) of equal size into one larger array, by arranging them along one or more new dimensions so it can be used to construct a matrix from row/column vectors (returned from function/expression). For versions before 1.9.0 it can be activated with using Compat
.
column(j) = [j, j, j]
stack(column(j) for j=1:4)
#3×4 Matrix{Int64}:
# 1 2 3 4
# 1 2 3 4
# 1 2 3 4
What is fast and memory efficient.
using BenchmarkTools
column(j) = [j, j, j]
@btime stack(column(j) for j=1:4)
# 193.315 ns (5 allocations: 480 bytes)
#3×4 Matrix{Int64}:
# 1 2 3 4
# 1 2 3 4
# 1 2 3 4
@btime begin #@Benoît Legat
m = Matrix{Int}(undef, 3, 4)
for j in 1:4
m[:, j] = column(j)
end
m
end
# 201.942 ns (5 allocations: 480 bytes)
#3×4 Matrix{Int64}:
# 1 2 3 4
# 1 2 3 4
# 1 2 3 4
@btime reduce(hcat, [column(j) for j=1:4]) #@Przemyslaw Szufel
# 249.676 ns (6 allocations: 560 bytes)
#3×4 Matrix{Int64}:
# 1 2 3 4
# 1 2 3 4
# 1 2 3 4
@btime reshape(collect(Iterators.flatten([column(j) for j in 1:4])), 3, 4) #@Benoît Legat
# 392.325 ns (10 allocations: 976 bytes)
#3×4 Matrix{Int64}:
# 1 2 3 4
# 1 2 3 4
# 1 2 3 4
@btime Int[column(j)[i] for i in 1:3, j in 1:4] #@radioflash
# 440.211 ns (13 allocations: 1.09 KiB)
#3×4 Matrix{Int64}:
# 1 2 3 4
# 1 2 3 4
# 1 2 3 4
@btime hcat([column(j) for j=1:4]...) #@spencerlyon2
# 644.676 ns (10 allocations: 784 bytes)
#3×4 Matrix{Int64}:
# 1 2 3 4
# 1 2 3 4
# 1 2 3 4
In case you have already a vector.
c = [[j,j,j] for j in 1:4]
#4-element Vector{Vector{Int64}}:
# [1, 1, 1]
# [2, 2, 2]
# [3, 3, 3]
# [4, 4, 4]
stack(c)
#3×4 Matrix{Int64}:
# 1 2 3 4
# 1 2 3 4
# 1 2 3 4
using BenchmarkTools
@btime stack($c)
# 63.204 ns (1 allocation: 160 bytes)
#3×4 Matrix{Int64}:
# 1 2 3 4
# 1 2 3 4
# 1 2 3 4
@btime reduce(hcat, $c) #@Przemyslaw Szufel
# 68.758 ns (1 allocation: 160 bytes)
#3×4 Matrix{Int64}:
# 1 2 3 4
# 1 2 3 4
# 1 2 3 4
@btime begin #@Benoît Legat
m = Matrix{Int}(undef, 3, 4)
for j in 1:4
m[:, j] = $c[j]
end
m
end
# 75.586 ns (1 allocation: 160 bytes)
#3×4 Matrix{Int64}:
# 1 2 3 4
# 1 2 3 4
# 1 2 3 4
@btime reshape(collect(Iterators.flatten($c)), 3, 4) #@Benoît Legat
# 210.752 ns (5 allocations: 576 bytes)
#3×4 Matrix{Int64}:
# 1 2 3 4
# 1 2 3 4
# 1 2 3 4
@btime hcat($c...) #@spencerlyon2
# 453.213 ns (5 allocations: 384 bytes)
#3×4 Matrix{Int64}:
# 1 2 3 4
# 1 2 3 4
# 1 2 3 4
Upvotes: 4
Reputation: 968
But I found no way to convert or reshape this into the form I want (matrix with two dimensions instead of array of arrays).
You can obtain the matrix by reshaping the flatten'ed version of the vector of columns c
:
julia> column(j) = [j, j, j]
column (generic function with 1 method)
julia> c = [column(j) for j in 1:4]
4-element Vector{Vector{Int64}}:
[1, 1, 1]
[2, 2, 2]
[3, 3, 3]
[4, 4, 4]
julia> v = collect(Iterators.flatten(c))
12-element Vector{Int64}:
1
1
1
2
2
2
3
3
3
4
4
4
julia> m = reshape(v, 3, 4)
3×4 Matrix{Int64}:
1 2 3 4
1 2 3 4
1 2 3 4
The alternative hcat(c...)
suggested by the accepted answer works as well but as pointed out by a comment, this will not work for a large number of columns.
Indeed, if c
is a vector of n
columns then hcat(c...)
will compile a method with n
arguments. This first means that if your code does this for different number of columns then you will need to compile a new method every time. Secondly, compiling a method with too many arguments will make the stack memory overflow.
For this reason, when you don't need a one-line solution (e.g. like the one combining flatten and reshape above), it seems to me that most Julia users do:
m = Matrix{Int}(undef, 3, 4)
for j in 1:4
m[:, j] = column(j)
end
EDIT @Przemyslaw suggested a better solution: https://stackoverflow.com/a/68976092/1189815
Upvotes: 4
Reputation: 42264
Looking at answer @Benoit it is actually faster to use reduce(hcat, c)
:
julia> c = [[j,j,j] for j in 1:4]
4-element Vector{Vector{Int64}}:
[1, 1, 1]
[2, 2, 2]
[3, 3, 3]
[4, 4, 4]
julia> @btime reshape(collect(Iterators.flatten($c)), 3, 4)
232.791 ns (6 allocations: 448 bytes)
3×4 Matrix{Int64}:
1 2 3 4
1 2 3 4
1 2 3 4
julia> @btime reduce(hcat, $c)
70.825 ns (1 allocation: 176 bytes)
3×4 Matrix{Int64}:
1 2 3 4
1 2 3 4
1 2 3 4
Upvotes: 4
Reputation: 9686
Have you tried hcat
?:
julia> column(j) = [j, j ,j]
column (generic function with 1 method)
julia> my_matrix = hcat([column(j) for j=1:4]...)
3x4 Array{Int64,2}:
1 2 3 4
1 2 3 4
1 2 3 4
Upvotes: 9