Say I have a vector of strings like this:
julia> R = ["ABC","DEF"]
2-element Vector{String}:
"ABC"
"DEF"
Now I duplicate the elements to form a 2*2 matrix:
julia> x = [R R]
2×2 Matrix{String}:
"ABC" "ABC"
"DEF" "DEF"
What I want to achieve is to concatenate the strings from each row of the matrix. The best I could come up with is
julia> [join(x[i,:]) for i in 1:length(x)÷2]
2-element Vector{String}:
"ABCABC"
"DEFDEF"
which gives the desired result.
Are there alternative solutions (without an explicit loop)? I tried to find a working syntax with broadcasting but failed.
(Another idea I tried was
julia> x = [R,R]
2-element Vector{Vector{String}}:
["ABC", "DEF"]
["ABC", "DEF"]
julia> join.(x)
2-element Vector{String}:
"ABCDEF"
"ABCDEF"
which is "simpler" but obviously does not give the desired result.)
Upvotes: 4
Views: 961
I wound up with a few options for applying a function to the rows of a matrix:
julia> x = ["ABC" "ABC"; "DEF" "DEF"]
2×2 Matrix{String}:
"ABC" "ABC"
"DEF" "DEF"
mapslices
The function you are looking for might be mapslices(f, A; dims):
julia> mapslices(join, x; dims=[2])
2×1 Matrix{String}:
"ABCABC"
"DEFDEF"
It's a "map" called on "slices" of the array instead of on its elements, along the dimensions given by dims.
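To illustrate how dims selects the slices, here is a small sketch using the same x as above. The variable names (rows, cols, rowvec) are only for illustration. It also shows vec, which can be used if a plain vector is wanted instead of a 2×1 matrix.

```julia
x = ["ABC" "ABC"; "DEF" "DEF"]

# dims=2: each slice runs along the second dimension, i.e. it is a row.
# The result keeps a singleton second dimension (a 2×1 matrix).
rows = mapslices(join, x; dims=2)

# dims=1: each slice runs along the first dimension, i.e. it is a column.
# The result is a 1×2 matrix.
cols = mapslices(join, x; dims=1)

# vec flattens the 2×1 result into an ordinary vector.
rowvec = vec(rows)
```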
eachrow
eachrow(A::AbstractVecOrMat) creates an iterator over the rows of the matrix, returning an array view for each.
julia> join.(eachrow(x))
2-element Vector{String}:
"ABCABC"
"DEFDEF"
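As a small sketch of what the iterator yields, the following collects the rows and also passes a delimiter to join. The delimiter "-" and the variable names are just illustrative choices. eachcol is the column-wise counterpart.

```julia
x = ["ABC" "ABC"; "DEF" "DEF"]

# Collect the row views to inspect them; each one behaves like a vector.
rows = collect(eachrow(x))
first_row = rows[1]

# Broadcasting treats a string as a scalar, so the delimiter is reused for every row.
dashed = join.(eachrow(x), "-")

# eachcol iterates over the columns instead.
by_col = join.(eachcol(x))
```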
map and eachrow
julia> map(join, eachrow(x))
2-element Vector{String}:
"ABCABC"
"DEFDEF"
The performance of the three approaches appears to be nearly identical on a 100×100 random array using BenchmarkTools (with a small overhead for the more flexible mapslices):
method | performance |
---|---|
1. @btime mapslices(join, x; dims=[2]) | 1.379 ms (21935 allocations: 4.92 MiB) |
2. @btime join.(eachrow(x)) | 1.296 ms (21206 allocations: 4.82 MiB) |
3. @btime map(join, eachrow(x)) | 1.294 ms (21304 allocations: 4.82 MiB) |
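Before timing the three variants, it may be worth checking that they agree on a comparable input. The sketch below assumes a 100×100 matrix of random 3-character strings (the string length and the seed are my assumptions, not taken from the benchmark above) and only uses the Random standard library. When timing with BenchmarkTools, it is generally recommended to interpolate the global, as in @btime join.(eachrow($x)).

```julia
using Random

Random.seed!(1)  # arbitrary seed, only to make the sketch reproducible

# Assumed test input: a 100×100 matrix of random 3-character strings.
x = [randstring(3) for i in 1:100, j in 1:100]

a = vec(mapslices(join, x; dims=2))  # vec turns the 100×1 matrix into a vector
b = join.(eachrow(x))
c = map(join, eachrow(x))

# All three should produce the same 100 concatenated rows.
same = (a == b) && (b == c)
```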
Upvotes: 4
Seeing as you say in the comments that you are starting from R, there's no point in forming x to get what you're after; just repeat the elements of R directly:
julia> repeat.(R, 2) == join.(eachrow([R R]))
true
julia> @btime repeat.($R, 2);
61.283 ns (3 allocations: 128 bytes)
julia> @btime join.(eachrow([$R $R]));
354.392 ns (11 allocations: 704 bytes)
In this case, repeat makes roughly a quarter as many allocations and is more than 5 times faster.
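For reference, a minimal sketch of a few equivalent spellings, assuming the same R as in the question. The count n is a name introduced here for illustration. For strings, ^ is defined as repetition and * as concatenation, so both can be broadcast as well.

```julia
R = ["ABC", "DEF"]
n = 2  # illustrative repetition count

a = repeat.(R, n)  # broadcast repeat(s, n) over the elements of R
b = R .^ n         # for strings, s^n is the same as repeat(s, n)
c = R .* R         # * concatenates strings; matches the others only for n == 2
```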
Edited to add a benchmark that's closer to the one in the other answer, using a 100-element vector of random strings of length 3:
julia> using Random
julia> R = [randstring(3) for _ ∈ 1:100];
julia> @btime join.(eachrow([$R for _ ∈ 1:100]));
1.607 ms (1103 allocations: 138.62 KiB)
julia> @btime repeat.($R, 100);
43.497 μs (101 allocations: 33.69 KiB)
So here the difference in timings is more like a factor of 40 (although it's less obvious what to benchmark now, because in addition to benchmarking the concatenation of the rows of x, one could think of different ways to construct x from R, with different levels of efficiency).
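As a sketch of two ways one might build x from R as an actual matrix (n = 3 is an arbitrary choice here, and the variable names are illustrative): repeat with per-dimension counts, or hcat reduced over copies of R. Note that a comprehension of the form [R for _ in 1:n] yields a vector of vectors rather than a matrix, so eachrow would not iterate over the rows of x in that case.

```julia
R = ["ABC", "DEF"]
n = 3  # illustrative number of columns

x1 = repeat(R, 1, n)           # repeat once along dim 1, n times along dim 2
x2 = reduce(hcat, fill(R, n))  # horizontally concatenate n copies of R
v  = [R for _ in 1:n]          # a Vector{Vector{String}}, not a Matrix

# Joining the rows of the matrix agrees with repeating each element directly.
agree = join.(eachrow(x1)) == repeat.(R, n)
```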
Upvotes: 3