Reputation: 1322
Is there a neat function in julia which will merge two sorted arrays and return the sorted array for me? I have written:
c=1
p=1
i=1
n=length(tc)+length(tp)
t=Array{Float64}(n)
while(c<=length(tc) && p<=length(tp))
if(tp[p]<tc[c])
t[i]=tp[p]
p=p+1;
i=i+1;
else
t[i]=tc[c]
c=c+1;
i=i+1;
end
end
while(p<=length(tp))
t[i]=tp[p]
i=i+1
p=p+1
end
while(c<=length(tc))
t[i]=tc[c]
i=i+1
c=c+1
end
but is there no native function in base julia to do this?
Upvotes: 6
Views: 1699
Reputation: 18560
Contrary to the other answers, there is in fact a method to do this in base Julia. BUT, it only works for arrays of integers, AND it will only work if the arrays are unique (in the sense that no integer is repeated in either array). Simply use the IntSet
type as follows:
a = [2, 3, 4, 8]
b = [1, 5]
union(IntSet(a), IntSet(b))
If you run the above code, you'll note that the union
function removes duplicates from the output, which is why I stated initially that your arrays must be unique (or else you must be happy to have duplicates removed in the output). You'll also notice that the union
operation on the IntSet
works much faster than union
on a sorted Vector{Int}
, since the former exploits the fact that an IntSet
is pre-sorted.
Of course, the above is not really in the spirit of the question, which more concerns a solution for any type for which the lt
operator is defined, as well as allowing for duplicates.
Here is a function that efficiently finds the union of two pre-sorted unique vectors. I've never had a need for the non-unique case myself so have not written a function that covers that case I'm afraid:
"union <- Return the union of the inputs as a new sorted vector"
function union_vec(x::Vector{T}, y::Vector{T})::Vector{T} where {T}
(nx, ny) = (1, 1)
z = T[]
while nx <= length(x) && ny <= length(y)
if x[nx] < y[ny]
push!(z, x[nx])
nx += 1
elseif y[ny] < x[nx]
push!(z, y[ny])
ny += 1
else
push!(z, x[nx])
nx += 1
ny += 1
end
end
if nx <= length(x)
[ push!(z, x[n]) for n = nx:length(x) ]
elseif ny <= length(y)
[ push!(z, y[n]) for n = ny:length(y) ]
end
return z
end
Another option is to look at sorted dictionaries, available in the DataStructures.jl package. I haven't done it myself, but a method that just inserts all observations into a sorted dictionary (checking for key duplication as you go) and then iterates over (keys, values)
should also be a fairly efficient way to attack this problem.
Upvotes: 4
Reputation: 106
I keep coming across this in different projects, so I made a package MergeSorted (https://github.com/vvjn/MergeSorted.jl). You can use it as follows.
using MergeSorted
a = sort!(rand(1000))
b = sort!(rand(1000))
c = mergesorted(a,b)
sort!(vcat(a,b)) == c
Or without allocating new memory.
mergesorted!(c, a, b)
You can also use all of the sort
options.
a = sort!(rand(1000), order=Base.Reverse)
b = sort!(rand(1000), order=Base.Reverse)
c = mergesorted(a,b, order=Base.Reverse)
sort!(vcat(a,b), order=Base.Reverse) == c
It's around 4-6 times faster than sort!(vcat(a,b))
, which uses QuickSort
by default, and twice as fast as sort!(vcat(a,b), alg=MergeSort)
but MergeSort
uses more memory.
Upvotes: 3
Reputation: 18217
Although an explicit function to merge two sorted vectors seems to be missing, one can be constructed easily from the existing building blocks (the question actually demonstrated this, but it doesn't define a function).
The following method tries to leverage the existing sort
code and still remain efficient.
In code:
mergesorted(a,b) = sort!(vcat(a,b))
The following is an example:
julia> a = [1:2:11...];
julia> b = [2:3:20...];
julia> show(a)
[1,3,5,7,9,11]
julia> show(b)
[2,5,8,11,14,17,20]
julia> show(mergesorted(a,b))
[1,2,3,5,5,7,8,9,11,11,14,17,20]
I didn't benchmark the function, but QuickSort (the default sort algorithm) is usually good performing on pre-sorted arrays, so it should be OK and the allocation of a result vector is required in any implementation.
Upvotes: 3
Reputation: 222581
No, such function does not exist. And actually I have not seen a language which has such function out of the box.
To do this, you have to maintain two pointers in each of the arrays, compare the values and move the smaller (based on what I see, this is exactly what you do).
Upvotes: 2