José Manuel

Reputation: 101

Write to multiple files simultaneously in Julia

How do I print to multiple files simultaneously in Julia? Is there a cleaner way than this:

for f in [open("file1.txt", "w"), open("file2.txt", "w")]
    write(f, "content")
    close(f)
end

Upvotes: 3

Views: 911

Answers (4)

Jake Ireland

Reputation: 652

If you only needed to read the files line by line, you could probably do something like this:

for (line_a, line_b) in zip(eachline("file_a.txt"), eachline("file_b.txt"))
    # do stuff
end

This works because eachline returns an iterable EachLine object, which has an I/O stream attached to it.
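A minimal sketch of the pattern above, assuming two small text files. The helper name paired_lines and the file contents are made up for illustration. Note that zip stops at the shorter input, so trailing lines of the longer file are skipped.

```julia
# Hypothetical helper: collect corresponding lines of two files as tuples.
function paired_lines(path_a::AbstractString, path_b::AbstractString)
    pairs = Tuple{String,String}[]
    for (line_a, line_b) in zip(eachline(path_a), eachline(path_b))
        push!(pairs, (line_a, line_b))
    end
    return pairs
end

# Usage with two throwaway files in a temporary directory.
dir = mktempdir()
file_a = joinpath(dir, "file_a.txt")
file_b = joinpath(dir, "file_b.txt")
write(file_a, "1\n2\n")
write(file_b, "x\ny\nz\n")
result = paired_lines(file_a, file_b)
```

Here result holds two pairs, because file_a.txt has only two lines.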

Upvotes: 0

张实唯

Reputation: 2862

Here is a coroutine (task-based) version that performs the I/O concurrently, like the multi-process one, but avoids duplicating the data and transferring it between processes.

julia> using Distributed, Random

julia> global const content = [randstring(10^8) for _ in 1:10];

julia> function swrite()
           for i in 1:10
               open("file$(i).txt", "w") do f
                   write(f, content[i])
               end
           end
       end
swrite (generic function with 1 method)

julia> @time swrite()
  1.339323 seconds (23.68 k allocations: 1.212 MiB)

julia> @time swrite()
  1.876770 seconds (114 allocations: 6.875 KiB)

julia> function awrite()
           @sync for i in 1:10
               @async open("file$(i).txt", "w") do f
                   write(f, "content")  # note: the literal string, not content[i] as in swrite
               end
           end
       end
awrite (generic function with 1 method)

julia> @time awrite()
  0.243275 seconds (155.80 k allocations: 7.465 MiB)

julia> @time awrite()
  0.001744 seconds (144 allocations: 14.188 KiB)

julia> addprocs(4)
4-element Array{Int64,1}:
 2
 3
 4
 5

julia> function ppwrite()
           @sync @distributed for i in 1:10
               open("file$(i).txt", "w") do f
                   write(f, "content")  # note: the literal string, not content[i] as in swrite
               end
           end
       end
ppwrite (generic function with 1 method)

julia> @time ppwrite()
  1.806847 seconds (2.46 M allocations: 123.896 MiB, 1.74% gc time)
Task (done) @0x00007f23fa2a8010

julia> @time ppwrite()
  0.062830 seconds (5.54 k allocations: 289.161 KiB)
Task (done) @0x00007f23f8734010
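As shown, awrite and ppwrite write the short literal "content" while swrite writes content[i], so the timings above are not like for like. A task-based version that writes the actual per-file content could be sketched as follows. The function name awrite_contents, the dir argument, and the tiny sample strings are assumptions for illustration, not part of the original benchmark, and no timings are claimed for it.

```julia
# Hypothetical variant of awrite: one task per file, each writing its own content.
function awrite_contents(dir::AbstractString, contents)
    @sync for (i, c) in enumerate(contents)
        # The explicit begin block keeps the do-block clearly inside the task.
        @async begin
            open(joinpath(dir, "file$(i).txt"), "w") do f
                write(f, c)
            end
        end
    end
    return nothing
end

# Usage with small sample data in a temporary directory.
dir = mktempdir()
contents = ["first", "second", "third"]
awrite_contents(dir, contents)
```

@sync waits for all the tasks started with @async inside the loop, so every file is written and closed by the time the function returns.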

Upvotes: 0

carstenbauer

Reputation: 10127

If you really want to write in parallel (using multiple processes) you can do it as follows:

using Distributed
addprocs(4) # add, say, 4 worker processes

function ppwrite()
    @sync @distributed for i in 1:10
        open("file$(i).txt", "w") do f
            write(f, "content")
        end
    end
end

For comparison, the serial version would be

function swrite()
    for i in 1:10
        open("file$(i).txt", "w") do f
            write(f, "content")
        end
    end
end

On my machine (SSD, quad-core) the parallel version is about 70% faster (timed with @btime from BenchmarkTools):

julia> @btime ppwrite();
  3.586 ms (505 allocations: 25.56 KiB)

julia> @btime swrite();
  6.066 ms (90 allocations: 6.41 KiB)

Be aware, however, that these timings can change drastically for real content, which may have to be transferred to the worker processes. The speedup also probably won't scale with more processes, since I/O will typically become the bottleneck at some point.

Update: larger (string) content

julia> using Distributed, Random, BenchmarkTools

julia> addprocs(4);

julia> global const content = [string(rand(1000,1000)) for _ in 1:10];

julia> function ppwrite()
           @sync @distributed for i in 1:10
               open("file$(i).txt", "w") do f
                   write(f, content[i])
               end
           end
       end
ppwrite (generic function with 1 method)

julia> function swrite()
           for i in 1:10
               open("file$(i).txt", "w") do f
                   write(f, content[i])
               end
           end
       end
swrite (generic function with 1 method)

julia> @btime swrite()
  63.024 ms (110 allocations: 6.72 KiB)

julia> @btime ppwrite()
  23.464 ms (509 allocations: 25.63 KiB) # ~ 2.7x speedup

Doing the same thing with string representations of larger 10000x10000 matrices (3 matrices instead of 10) results in

julia> @time swrite()
  7.189072 seconds (23.60 k allocations: 1.208 MiB)

julia> @time swrite()
  7.293704 seconds (37 allocations: 2.172 KiB)

julia> @time ppwrite();
 16.818494 seconds (2.53 M allocations: 127.230 MiB) # > 2x slowdown of first call

julia> @time ppwrite(); # ~30% slowdown on the second call
  9.729389 seconds (556 allocations: 35.453 KiB)

Upvotes: 4

Bogumił Kamiński

Reputation: 69839

From your question, I assume you do not mean writing in parallel (which probably would not speed things up, since the operation is most likely I/O-bound).

Your solution has one small problem - it does not guarantee that f is closed if write throws an exception.

Here are three alternative ways to do it making sure the file is closed even on error:

for fname in ["file1.txt", "file2.txt"]
    open(fname, "w") do f
        write(f, "content")
    end
end

for fname in ["file1.txt", "file2.txt"]
    open(f -> write(f, "content"), fname, "w")
end

foreach(fn -> open(f -> write(f, "content"), fn, "w"),
        ["file1.txt", "file2.txt"])

They give the same result so the choice is a matter of taste (you can derive some more variations of similar implementations).

All three rely on the following method of the open function:

 open(f::Function, args...; kwargs...)

  Apply the function f to the result of open(args...; kwargs...)
  and close the resulting file descriptor upon completion.
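What this method does can be sketched as a try/finally wrapper. This is an illustrative equivalent under the name open_and_apply (a hypothetical helper), not the actual source of Base.open.

```julia
# Hypothetical stand-in for open(f, args...; kwargs...): open the stream,
# apply f to it, and close the stream no matter how f exits.
function open_and_apply(f::Function, args...; kwargs...)
    io = open(args...; kwargs...)
    try
        return f(io)
    finally
        close(io)  # runs whether f returned normally or threw
    end
end

# Usage: same call shape as open(f, fname, "w").
path = tempname()
open_and_apply(io -> write(io, "content"), path, "w")
```

Because the function argument comes first, the do-block form works with it too, just as it does with open.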

Note that processing still stops if an exception is thrown somewhere; the only guarantee is that the file descriptor gets closed. To make sure every write operation is actually attempted, you can do something like:

for fname in ["file1.txt", "file2.txt"]
    try
        open(fname, "w") do f
            write(f, "content")
        end
    catch ex
        # here decide what should happen on error
        # you might want to investigate the value of ex here
    end
end
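One possible way to fill in the catch branch is to record each failure and carry on, so the caller can inspect the errors afterwards. The helper name write_each and the choice of returning a Dict are assumptions for illustration, not the only reasonable policy.

```julia
# Hypothetical helper: try to write content to every file and return a
# Dict mapping each file name that failed to the exception that was thrown.
function write_each(fnames, content)
    failures = Dict{String,Any}()
    for fname in fnames
        try
            open(fname, "w") do f
                write(f, content)
            end
        catch ex
            failures[fname] = ex  # remember the error and move on
        end
    end
    return failures
end

# Usage: the first path sits in a directory that does not exist, so opening it fails.
dir = mktempdir()
bad = joinpath(dir, "no_such_subdir", "file1.txt")
good = joinpath(dir, "file2.txt")
failures = write_each([bad, good], "content")
```

The second file is still written even though the first open fails, and failures contains an entry only for the bad path.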

See https://docs.julialang.org/en/latest/manual/control-flow/#The-try/catch-statement-1 for the documentation of try/catch.

Upvotes: 6
