Prokop Hapala

Reputation: 2444

How to convert between different kinds of multidimensional arrays in Julia

I'm migrating from Python/NumPy to Julia. I'm really confused by Julia's multidimensional arrays, and it feels like there is an additional level of complexity and hassle compared to NumPy.

There is a distinction between 1) row vectors, 2) column vectors, 3) multidimensional arrays and 4) nested arrays (= arrays of arrays). That would all be fine (perhaps useful for performance optimization), assuming there is a simple way to convert between them. But I cannot figure out how to do it.

Simple example: I am just trying to generate a 2D rectangular grid of points and plot them.


ps = [ [ix*0.1 iy*0.1] for ix=1:10, iy=1:10 ]
# 10×10 Array{Array{Float64,2},2}:
# Oh, this is a nested array? I want just a simple 3D array 10x10x2

scatter( ps[:,:,1], ps[:,:,2], markersize = 2, markerstrokewidth = 0, aspect_ratio=:equal )
# ERROR: BoundsError: attempt to access 10×10 Array{Array{Float64,2},2} at index [Base.Slice(Base.OneTo(10)), Base.Slice(Base.OneTo(10)), 2]

sh = size(ps)
# (10,10)

ps = reshape( ps, ( sh[1]*sh[2],2) )
# ERROR: DimensionMismatch("new dimensions (100, 2) must be consistent with  array size 100")
# Oh dear :(

ps = reshape( ps, ( sh[1]*sh[2],:) )
# 100×1 Array{Array{Float64,2},2}

xs = ps[:,1]
# 100-element Array{Array{Float64,2},1}
# ??? WTF? ... this array looks like the whole 'ps'
ys = ps[:,2] 
# ERROR: BoundsError: attempt to access 100×1 Array{Array{Float64,2},2} at index [Base.Slice(Base.OneTo(100)), 2]

xs = ps[:][1]
# 1×2 Array{Float64,2}: 
#  0.1  0.1
#  But I want all xs  (ps[:,1]), not (ps[1,:]) 

# Let's try some broadcasting
xs = ps.[1]
# ERROR: syntax: invalid syntax "ps.[1]"
xs = .ps[1]
# ERROR: syntax: invalid identifier name "."


# Perhaps transpose will help?
ps_ = ps'   #' stackoverflow syntax highlighting for Julia is broken ?
# 1×100 LinearAlgebra.Adjoint{LinearAlgebra.Adjoint{Float64,Array{Float64,2}},Array{Array{Float64,2},2}}:
# OMG! ... That is even worse

scatter( ps[:,1], ps[:,2], markersize = 2, markerstrokewidth = 0, aspect_ratio=:equal )
# Nope

OK, this somehow works. But I still need to figure out how to convert between the different shapes of arrays above.

using Plots
ps = [ [ix*0.1 iy*0.1] for ix=1:10, iy=1:10 ]
ps = vcat(ps...)
xs = ps[:,1]
ys = ps[:,2]
scatter( xs, ys, markersize = 2, markerstrokewidth = 0, aspect_ratio=:equal )

EDIT:

Maybe it could be good to list some tutorials where I was searching the answer before I asked:

Upvotes: 4

Views: 3454

Answers (3)

DNF

Reputation: 12654

Others have given replies that help solve your problem, so I'd rather just go through your example and try to explain what is going on. You seem pretty frustrated, which is not uncommon when learning a new language, but most of your complaints arise because Julia is being consistent.

ps = [ [ix*0.1 iy*0.1] for ix=1:10, iy=1:10 ]
# 10×10 Array{Array{Float64,2},2}:
# Oh, this is a nested array? I want just a simple 3D array 10x10x2

Here you are creating a 10x10 array, where each element is a 1x2 matrix. That's what you are asking for, it's not Julia being difficult or obscure, it's just being consistent and straightforward.

scatter( ps[:,:,1], ps[:,:,2], markersize = 2, markerstrokewidth = 0, aspect_ratio=:equal )
# ERROR: BoundsError: attempt to access 10×10 Array{Array{Float64,2},2} at index [Base.Slice(Base.OneTo(10)), Base.Slice(Base.OneTo(10)), 2]

You have a 2D array, so you cannot index into it with 3 indices.

sh = size(ps)
# (10,10)

ps = reshape( ps, ( sh[1]*sh[2],2) )
# ERROR: DimensionMismatch("new dimensions (100, 2) must be consistent with  array size 100")
# Oh dear :(

You have a 10x10 array and try to reshape it into a 100x2 array. The new array would have 200 elements, twice as many as the original, so this cannot work. reshape counts only the 100 outer elements; it does not look inside the nested 1x2 matrices.
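As a minimal sketch of the element-count rule described above (the variable names n_outer, n_inner and mismatch are only illustrative), the nested array has 100 elements as far as reshape is concerned, even though it holds 200 numbers in total:

```julia
# Rebuild the nested array from the question: a 10×10 array of 1×2 matrices.
ps = [[ix*0.1 iy*0.1] for ix = 1:10, iy = 1:10]

n_outer = length(ps)        # number of outer elements that reshape sees
n_inner = sum(length, ps)   # total count of Float64 values stored inside

# Reshaping to any shape with 100 elements is fine.
flat = reshape(ps, 100)

# Asking for 100×2 = 200 elements throws a DimensionMismatch.
mismatch = try
    reshape(ps, 100, 2)
    false
catch err
    err isa DimensionMismatch
end
```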

ps = reshape( ps, ( sh[1]*sh[2],:) )
# 100×1 Array{Array{Float64,2},2}

Here you reshape it into a 100x1 array, that's fine.

xs = ps[:,1]
# 100-element Array{Array{Float64,2},1}
# ??? WTF? ... this array looks like the whole 'ps'

And now you are asking for the first (and only) column of the new, reshaped ps. So, naturally, you get all the data. Notice that xs is now a 1D array, not a 100x1 2D array.
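The dimension drop mentioned above can be illustrated with a small sketch (the names col_vector and col_matrix are illustrative only): indexing with a scalar removes that dimension, while indexing with a range keeps it.

```julia
# The reshaped 100×1 nested array from the question.
ps = reshape([[ix*0.1 iy*0.1] for ix = 1:10, iy = 1:10], 100, 1)

col_vector = ps[:, 1]     # scalar column index: result is a 1D array
col_matrix = ps[:, 1:1]   # range column index: result stays a 100×1 2D array
```

Both results contain the same 100 elements; only the number of dimensions differs.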

ys = ps[:,2] 
# ERROR: BoundsError: attempt to access 100×1 Array{Array{Float64,2},2} at index [Base.Slice(Base.OneTo(100)), 2]

You are asking for the second column of a 100x1 array.

xs = ps[:][1]
# 1×2 Array{Float64,2}: 
#  0.1  0.1
#  But I want all xs  (ps[:,1]), not (ps[1,:]) 

ps[:] turns ps into a 1D vector with 100 elements, and then you ask for the first element of that. Seems to me to be expected behaviour.

# Let's try some broadcasting
xs = ps.[1]
# ERROR: syntax: invalid syntax "ps.[1]"

Yes, this doesn't work. It's not unreasonable to expect that it might, and it is a possible future feature. Perhaps you are looking for first.(ps), which reads the first element from each element of ps. Similarly, last.(ps) reads the last element from each element of ps.
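A short sketch of that suggestion, applied to the original 10×10 nested array from the question (before any reshaping):

```julia
# The nested array from the question: each element is a 1×2 matrix [x y].
ps = [[ix*0.1 iy*0.1] for ix = 1:10, iy = 1:10]

xs = first.(ps)   # 10×10 array holding the x value of every point
ys = last.(ps)    # 10×10 array holding the y value of every point

# Flatten to 1D vectors, e.g. to pass them to a plotting function.
xs_flat = vec(xs)
ys_flat = vec(ys)
```

The results are ordinary arrays of Float64, so the nesting is gone after this step.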

xs = .ps[1]
# ERROR: syntax: invalid identifier name "."

This is not valid syntax. The dot syntax only works on functions and operators.
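Since indexing is itself a call to the function getindex, one way to stay within the rule above is to broadcast that function explicitly. This is only a sketch of that idea; map with an anonymous function is an equivalent alternative.

```julia
ps = [[ix*0.1 iy*0.1] for ix = 1:10, iy = 1:10]

# p[2] is getindex(p, 2), so the dot syntax applies to getindex.
ys_broadcast = getindex.(ps, 2)

# The same thing written with map and an anonymous function.
ys_map = map(p -> p[2], ps)
```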

# Perhaps transpose will help?
ps_ = ps'   #' stackoverflow syntax highlighting for Julia is broken ?
# 1×100 LinearAlgebra.Adjoint{LinearAlgebra.Adjoint{Float64,Array{Float64,2}},Array{Array{Float64,2},2}}:
# OMG! ... That is even worse

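Not sure what you want to happen here. The ' operator is the adjoint, which returns a lazy wrapper for performance reasons and is applied recursively to the elements, hence the nested Adjoint type. It's pretty neat. If you only want to swap the dimensions of the outer array, use permutedims(ps).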
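To make the difference concrete, here is a small sketch comparing the ' operator with permutedims on the reshaped nested array (the names lazy and eager are illustrative). The adjoint is a lazy wrapper that also flips each inner matrix, whereas permutedims builds a plain array and leaves the elements untouched.

```julia
using LinearAlgebra

ps = reshape([[ix*0.1 iy*0.1] for ix = 1:10, iy = 1:10], 100, 1)

lazy  = ps'               # lazy Adjoint wrapper, applied recursively to elements
eager = permutedims(ps)   # ordinary 1×100 array, elements are not transposed

inner_lazy  = size(lazy[1, 1])    # inner 1×2 matrix appears flipped to 2×1
inner_eager = size(eager[1, 1])   # inner matrix is still 1×2
```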

scatter( ps[:,1], ps[:,2], markersize = 2, markerstrokewidth = 0, aspect_ratio=:equal )
# Nope

As far as I recall, you have changed ps into a 100x1 array, so ps[:,2] cannot work.

Upvotes: 0

Jeffrey Sarnoff

Reputation: 1757

Julia works in column-major fashion, so the basic vector is a column vector. To convert a column vector to a row vector (a 1×N matrix), use permutedims(colvec). To convert a row vector back, use permutedims(rowvec); note that this gives an N×1 matrix rather than a 1-dimensional vector.

julia> colvec = [1, 2, 3]
3-element Array{Int64,1}:
 1
 2
 3

julia> rowvec = permutedims(colvec)
1×3 Array{Int64,2}:
 1  2  3

julia> permutedims(rowvec)
3×1 Array{Int64,2}:
 1
 2
 3
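As the output above shows, the round trip through permutedims ends in a 3×1 matrix rather than the original 1-dimensional vector. A minimal sketch of getting back to a true vector with vec (the names back and flat are illustrative):

```julia
colvec = [1, 2, 3]
rowvec = permutedims(colvec)   # 1×3 matrix
back   = permutedims(rowvec)   # 3×1 matrix, still 2-dimensional

flat = vec(rowvec)             # 3-element 1-dimensional vector again
```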

To convert a matrix (or any 2-dimensional array) to a column vector, use vec. Because Julia stores 2-dimensional arrays by column, this will traverse each column in turn. Note that 2D array dimensions are shown as <rows>x<cols> Array{<type>,2}.

julia> matrix = [1 4 
                 2 5 
                 3 6]
3×2 Array{Int64,2}:
 1  4
 2  5
 3  6  

julia> colvec = vec(matrix)
6-element Array{Int64,1}:
 1
 2
 3
 4
 5
 6

Converting that colvec back to the original array requires knowing the dimensions of the original array. ndims(x) counts the dimensions of x and size(x) gives the number of elements along each dimension of x.

julia> reshape(colvec, size(matrix))
3×2 Array{Int64,2}:
 1  4
 2  5
 3  6
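One point that may matter when coming from NumPy: for an ordinary Array, vec and reshape return arrays that share memory with the original, much like a view. The sketch below illustrates this; use copy when an independent array is needed.

```julia
matrix = [1 4; 2 5; 3 6]

colvec = vec(matrix)   # shares memory with matrix
colvec[1] = 100        # this write is visible through matrix as well

independent = copy(vec(matrix))   # separate storage
independent[2] = -1               # matrix is not affected by this write
```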

You can transpose the entries with permutedims(matrix).

julia> matrix
3×2 Array{Int64,2}:
 1  4
 2  5
 3  6

julia> permutedims(matrix)
2×3 Array{Int64,2}:
 1  2  3
 4  5  6

The same principles apply to higher dimensional arrays.

julia> array = reshape(collect(1:12), (3, 2, 2))
3×2×2 Array{Int64,3}:
[:, :, 1] =
 1  4
 2  5
 3  6

[:, :, 2] =
 7  10
 8  11
 9  12

julia> vec(array)
12-element Array{Int64,1}:
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
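For arrays with more than two dimensions, permutedims takes a second argument giving the new order of the dimensions. A brief sketch, using the same 3×2×2 array (the name p is illustrative):

```julia
array = reshape(collect(1:12), (3, 2, 2))

# Put the old 3rd dimension first, then the old 1st, then the old 2nd.
p = permutedims(array, (3, 1, 2))

# Element [i, j, k] of array is found at [k, i, j] in p.
same = all(p[k, i, j] == array[i, j, k] for i = 1:3, j = 1:2, k = 1:2)
```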

For working with nested arrays, I suggest using RecursiveArrayTools.jl.
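For the specific nested array in the question, a package is not strictly required. The following is only a sketch using Base functions (the names flat and grid are illustrative): stack the 1×2 matrices into a 100×2 matrix, then reshape that into the 10×10×2 array the question asks for.

```julia
ps = [[ix*0.1 iy*0.1] for ix = 1:10, iy = 1:10]

# Stack the 100 inner 1×2 matrices vertically into one 100×2 matrix.
flat = reduce(vcat, vec(ps))

# Column-major reshape restores the (ix, iy) grid; the last index selects x or y.
grid = reshape(flat, 10, 10, 2)

xs = flat[:, 1]   # all x coordinates
ys = flat[:, 2]   # all y coordinates
```

reduce(vcat, ...) does the same job as vcat(ps...) from the question without splatting 100 arguments.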

Upvotes: 5

Nils Gudat

Reputation: 13800

There's quite a lot in the code above, but just focusing on your point of departure and the intended outcome:

Why nested array?

Your comprehension creates the array [ix*0.1 iy*0.1] for every combination of ix and iy, so I would argue you explicitly asked for it.

There are probably some whizzy ways to either do this with a fancy comprehension or somehow flatten the nested array, but in cases like this one I like to be explicit about what I'm trying to achieve:

ps = zeros(10,10,2) # 10x10x2 Array{Float64,3}
for ix = 1:10, iy = 1:10
        ps[ix, iy, :] = [ix*0.1 iy*0.1]
end

If it's about having a one-liner you can consider creating both 10x10 arrays in comprehensions and then concatenating those along a third dimension:

ps = cat([ix*0.1 for ix=1:10, iy=1:10], [iy*0.1 for ix=1:10, iy=1:10], dims = 3)
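As a quick check, the sketch below builds the array both ways and confirms the results agree; it then extracts flat coordinate vectors of the kind a call like scatter(xs, ys) would take. The names ps_loop and ps_cat are illustrative, and the loop assigns the two components separately instead of assigning a 1×2 matrix to the slice.

```julia
# Explicit loop version.
ps_loop = zeros(10, 10, 2)
for ix = 1:10, iy = 1:10
    ps_loop[ix, iy, 1] = ix * 0.1
    ps_loop[ix, iy, 2] = iy * 0.1
end

# One-liner version: two 10×10 arrays concatenated along a third dimension.
ps_cat = cat([ix*0.1 for ix = 1:10, iy = 1:10],
             [iy*0.1 for ix = 1:10, iy = 1:10], dims = 3)

# Flat coordinate vectors for plotting.
xs = vec(ps_cat[:, :, 1])
ys = vec(ps_cat[:, :, 2])
```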

Upvotes: 2
