Reputation: 345
Edited for Clarity!
There are a couple of ways to build/generate an array in Julia.
I have been using the single quote (apostrophe) approach for column vectors because it is quicker to type than multiple commas inside the brackets:
julia> a = [1 2 3 4]'
4×1 LinearAlgebra.Adjoint{Int64,Array{Int64,2}}:
1
2
3
4
This generates what I believe to be a more complicated data type: "LinearAlgebra.Adjoint{Int64,Array{Int64,2}}"
Compare this with comma-separated elements:
julia> a = [1,2,3,4]
4-element Array{Int64,1}:
1
2
3
4
This generates an Array{Int64,1} type.
The Question(s):
Is the LinearAlgebra.Adjoint{...} type more computationally expensive than the base array? Should I avoid generating this array in general (i.e. outside of modeling linear algebra)?
The difference may be too small to matter at a small scale, but I plan to eventually perform operations on large data sets. Should I stay consistent and generate them as Array{Int64,1} types for these purposes?
Original
I've been learning Julia and I would like to develop good habits early, focusing on computational efficiency. I have been working with arrays and have gotten comfortable with the single quote notation at the end to convert into a column vector. From what I understand about the type system, this isn't just a quicker way to write the same thing as the comma version. Is using the comma computationally more expensive or semantically undesirable in general? It seems it wouldn't matter with smaller data sets, but what about larger data sets (e.g. 10k computations)?
Deleted original code example to avoid confusion.
Upvotes: 0
Views: 157
Reputation: 12664
Here's a performance example:
julia> a = rand(10^6);
julia> b = rand(1, 10^6)';
julia> typeof(a)
Array{Float64,1}
julia> typeof(b)
Adjoint{Float64,Array{Float64,2}}
julia> @btime sum($a)
270.137 μs (0 allocations: 0 bytes)
500428.44363296847
julia> @btime sum($b)
1.710 ms (0 allocations: 0 bytes)
500254.2267732659
As you can see, the sum over the Vector is much faster than the sum over the Adjoint, roughly six times faster in this run (I'm actually a bit surprised by how big the difference is).
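If you want to try the comparison without installing BenchmarkTools, here is a rough sketch that uses only Base's @elapsed. It is much noisier than @btime, so treat the numbers as indicative only. The names b_vec, t_vector, t_adjoint and t_unwrapped are mine, and unwrapping with vec(parent(b)) assumes real-valued data and a single-row parent matrix (for complex numbers the adjoint also conjugates).

```julia
using LinearAlgebra  # standard library; exports the Adjoint type

a = rand(10^6)       # Vector{Float64}
b = rand(1, 10^6)'   # 10^6×1 Adjoint wrapping a 1×10^6 Matrix

# Recover a plain Vector from the wrapper without copying.
# Valid here because the parent has a single row and the data is real.
b_vec = vec(parent(b))

# Warm-up calls so that compilation time is not included in the timings.
sum(a); sum(b); sum(b_vec)

t_vector    = @elapsed sum(a)
t_adjoint   = @elapsed sum(b)
t_unwrapped = @elapsed sum(b_vec)

println("Vector:            ", t_vector, " s")
println("Adjoint:           ", t_adjoint, " s")
println("Unwrapped Adjoint: ", t_unwrapped, " s")
```

If you are ever handed an Adjoint like this, unwrapping it once before a hot loop is one way to get back to plain Vector code paths.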
But to me the bigger reason to use Vector is that it just seems weird and unnatural to use the complicated and convoluted Adjoint type. There is also a much bigger risk that some code will not accept an Adjoint, and then you've just made extra trouble for yourself.
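To make the "will not accept an Adjoint" risk concrete, here is a small sketch. The function total_length is hypothetical; it stands in for any code whose signature asks for an AbstractVector. The point is that [1 2 3 4]' is a 4×1 matrix, not a one-dimensional vector, so dispatch rejects it.

```julia
using LinearAlgebra  # standard library; exports the Adjoint type

# Hypothetical function that only accepts one-dimensional arrays.
total_length(v::AbstractVector) = length(v)

v = [1, 2, 3, 4]    # Vector: one-dimensional
m = [1 2 3 4]'      # Adjoint of a 1×4 Matrix: two-dimensional, size (4, 1)

println(v isa AbstractVector)   # true
println(m isa AbstractVector)   # false
println(m isa AbstractMatrix)   # true

# Calling the function with the Adjoint fails with a MethodError.
rejected = try
    total_length(m)
    false
catch err
    err isa MethodError
end
println("Adjoint rejected: ", rejected)
```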
But, really, why do you want to use the Adjoint? Is it just to avoid writing in commas? How long are these vectors you are typing in? If vector-typing is a very big nuisance to you, you could consider writing [1 2 3 4][:], which will return a Vector. It will also trigger an extra allocation and copy, and it looks strange, but if it's a very big deal to you, maybe it's worth it.
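For comparison, here is a short sketch of the options side by side. Besides the comma literal and the [:] trick, it shows Base's vec, which I'm adding as an extra alternative: for a plain Matrix it reshapes instead of copying, so the result shares memory with the original. The variable names are only for illustration.

```julia
row = [1 2 3 4]            # 1×4 Matrix

v_commas = [1, 2, 3, 4]    # Vector, the usual literal
v_colon  = row[:]          # Vector; allocates a new array and copies the data
v_vec    = vec(row)        # Vector; a reshape that shares memory with `row`

# Mutating `row` shows which result is a copy and which is a view of the same data.
row[1] = 10
println(v_colon[1])   # unchanged, because v_colon is an independent copy
println(v_vec[1])     # reflects the change, because v_vec shares memory with row
```

Whether sharing memory is what you want depends on the situation; for a literal you type once, the copy made by [:] is unlikely to matter.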
My advice: type the commas.
Upvotes: 1