fuenfundachtzig

Reputation: 8352

Julia: loop over rows of matrix (or not)

Say I have a vector of strings like this

julia> R = ["ABC","DEF"]
2-element Vector{String}:
"ABC"
"DEF"

Now I duplicate the elements to form a 2×2 matrix:

julia> x = [R R]
2×2 Matrix{String}:
"ABC"  "ABC"
"DEF"  "DEF"

What I want to achieve is to concatenate the strings from each row of the matrix. The best I could come up with is

julia> [join(x[i,:]) for i in 1:length(x)÷2]
2-element Vector{String}:
"ABCABC"
"DEFDEF"

which gives the desired result.

Are there alternative solutions (without an explicit loop)? I tried to find a working syntax with broadcasting but failed.

(Another idea I tried was

julia> x = [R,R]
2-element Vector{Vector{String}}:
["ABC", "DEF"]
["ABC", "DEF"]

julia> join.(x)
2-element Vector{String}:
"ABCDEF"
"ABCDEF"

which is "simpler" but obviously does not give the desired result.)

Upvotes: 4

Views: 961

Answers (2)

BoZenKhaa

Reputation: 941

I wound up with a few options for applying a function to the rows of a matrix:

julia> x = ["ABC" "ABC"; "DEF" "DEF"]
2×2 Matrix{String}:
 "ABC"  "ABC"
 "DEF"  "DEF"

1. map-like syntax using mapslices

The function you are looking for might be mapslices(f, A; dims):

julia> mapslices(join, x; dims=[2])
2×1 Matrix{String}:
 "ABCABC"
 "DEFDEF"

It's a "map" called on "slices" of the array instead of the elements, along the dimensions given by dims.
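Note that the result above is a 2×1 Matrix rather than a Vector, because mapslices keeps the dimension it reduced over. A minimal sketch, assuming a plain vector is what you want, is to flatten the result with vec:

```julia
x = ["ABC" "ABC"; "DEF" "DEF"]

# mapslices keeps the reduced dimension, so this is a 2×1 Matrix{String}
m = mapslices(join, x; dims=2)

# vec reshapes the 2×1 matrix into a Vector{String}
v = vec(m)
```

Here dims=2 is passed as a plain integer, which mapslices accepts as well as the dims=[2] form used above.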

2. broadcasting syntax using eachrow

eachrow(A::AbstractVecOrMat) creates an iterator over the rows of the matrix, returning a view of each row rather than a copy.

julia> join.(eachrow(x))
2-element Vector{String}:
 "ABCABC"
 "DEFDEF"
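As a small sketch of what eachrow hands to join, and of the column-wise counterpart eachcol (not part of the question, shown only for contrast):

```julia
x = ["ABC" "ABC"; "DEF" "DEF"]

# Each row is a view into x, not a copy
row1 = first(eachrow(x))

by_row = join.(eachrow(x))   # concatenate along each row
by_col = join.(eachcol(x))   # concatenate down each column
```

by_col reproduces the "ABCDEF" result from the question's vector-of-vectors attempt, which is why that attempt joined the wrong direction.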

3. Combining plain map and eachrow

julia> map(join, eachrow(x))
2-element Vector{String}:
 "ABCABC"
 "DEFDEF"

The performance of the three approaches appears to be nearly identical on a 100×100 random array, measured with BenchmarkTools:

method                                     performance
1. @btime mapslices(join, x; dims=[2])     1.379 ms (21935 allocations: 4.92 MiB)
2. @btime join.(eachrow(x))                1.296 ms (21206 allocations: 4.82 MiB)
3. @btime map(join, eachrow(x))            1.294 ms (21304 allocations: 4.82 MiB)

(with a small overhead for the more flexible mapslices)
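The answer does not show how the random array was built. A hedged sketch of one possible setup, assuming a 100×100 matrix of random 3-character strings, which also checks that the three approaches agree:

```julia
using Random

# Assumed setup: 100×100 matrix of random 3-character strings
x = [randstring(3) for _ in 1:100, _ in 1:100]

a = vec(mapslices(join, x; dims=2))  # flatten the 100×1 result to a vector
b = join.(eachrow(x))
c = map(join, eachrow(x))

# All three give the same 100 concatenated strings
same = a == b == c
```

When timing these with BenchmarkTools, interpolate the global as $x (for example @btime join.(eachrow($x))) so that access to the global variable is not part of the measurement.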

Upvotes: 4

Nils Gudat

Reputation: 13800

Seeing as you say in the comments that you are starting from R, there's no need to form x to get what you're after; just repeat the elements of R directly:

julia> repeat.(R, 2) == join.(eachrow([R R]))
true

julia> @btime repeat.($R, 2);
  61.283 ns (3 allocations: 128 bytes)

julia> @btime join.(eachrow([$R $R]));
  354.392 ns (11 allocations: 704 bytes)

In this case it makes roughly a quarter as many allocations and is more than 5 times faster.
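For reference, a short sketch of equivalent spellings: on strings, ^ is repetition and * is concatenation, so both can be broadcast over R in the same way as repeat:

```julia
R = ["ABC", "DEF"]
n = 2

a = repeat.(R, n)   # broadcast repeat over the elements
b = R .^ n          # ^ on a string repeats it n times
c = R .* R          # * on strings concatenates them
```

All three give ["ABCABC", "DEFDEF"]; the .* form only covers the n = 2 case.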

Edited to add a benchmark that's closer to the other answer, using a 100-element vector of random strings of length 3:

julia> using Random

julia> R = [randstring(3) for _ ∈ 1:100];

julia> @btime join.(eachrow([$R for _ ∈ 1:100]));
  1.607 ms (1103 allocations: 138.62 KiB)

julia> @btime repeat.($R, 100);
  43.497 μs (101 allocations: 33.69 KiB)

So here the difference in timings is more like a factor of 40 (although it's less obvious what to benchmark now: in addition to benchmarking the concatenation of the rows of x, one could think of different ways to construct x from R, with different levels of efficiency).
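To illustrate the point about constructing x, here is a sketch of a few ways to build the matrix from R; the small R and n are chosen only for illustration. Note that a comprehension such as [R for _ in 1:n] yields a Vector of Vectors rather than a Matrix, so the matrix has to be built explicitly:

```julia
R = ["ABC", "DEF", "GHI"]
n = 4

x1 = repeat(R, 1, n)             # tile R once down, n times across
x2 = reduce(hcat, fill(R, n))    # concatenate n copies of R as columns
x3 = [r for r in R, _ in 1:n]    # two-dimensional comprehension

# Joining the rows of any of these matches repeating each element n times
rows_match = join.(eachrow(x1)) == repeat.(R, n)
```

Each variant produces the same 3×4 Matrix{String}; which one is fastest would need to be measured separately.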

Upvotes: 3
