Reputation: 101
How do I print to multiple files simultaneously in Julia? Is there a cleaner way other than:
for f in [open("file1.txt", "w"), open("file2.txt", "w")]
    write(f, "content")
    close(f)
end
Upvotes: 3
Views: 911
Reputation: 652
If you only needed to read the files line by line, you could probably do something like this:
for (line_a, line_b) in zip(eachline("file_a.txt"), eachline("file_b.txt"))
    # do stuff
end
This works because eachline returns an iterable EachLine object with an I/O stream linked to it; when called with a file name, the stream is opened at the start of iteration and closed automatically at the end.
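For completeness, a minimal runnable sketch of this pattern (the file names and contents here are made up for illustration); it writes two small files first, then iterates them in lockstep:

```julia
# Create two small throwaway files (names are illustrative only).
open(f -> write(f, "a1\na2\n"), "file_a.txt", "w")
open(f -> write(f, "b1\nb2\n"), "file_b.txt", "w")

# eachline(filename) opens the file lazily and closes it
# automatically when the iteration finishes.
for (line_a, line_b) in zip(eachline("file_a.txt"), eachline("file_b.txt"))
    println(line_a, " | ", line_b)  # e.g. "a1 | b1"
end
```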
Upvotes: 0
Reputation: 2862
Just to add a coroutine version that does IO in parallel like the multiple-process one, but also avoids the data duplication and transfer.
julia> using Distributed, Random
julia> global const content = [randstring(10^8) for _ in 1:10];
julia> function swrite()
           for i in 1:10
               open("file$(i).txt", "w") do f
                   write(f, content[i])
               end
           end
       end
swrite (generic function with 1 method)
julia> @time swrite()
1.339323 seconds (23.68 k allocations: 1.212 MiB)
julia> @time swrite()
1.876770 seconds (114 allocations: 6.875 KiB)
julia> function awrite()
           @sync for i in 1:10
               @async open("file$(i).txt", "w") do f
                   write(f, "content")
               end
           end
       end
awrite (generic function with 1 method)
julia> @time awrite()
0.243275 seconds (155.80 k allocations: 7.465 MiB)
julia> @time awrite()
0.001744 seconds (144 allocations: 14.188 KiB)
julia> addprocs(4)
4-element Array{Int64,1}:
2
3
4
5
julia> function ppwrite()
           @sync @distributed for i in 1:10
               open("file$(i).txt", "w") do f
                   write(f, "content")
               end
           end
       end
ppwrite (generic function with 1 method)
julia> @time ppwrite()
1.806847 seconds (2.46 M allocations: 123.896 MiB, 1.74% gc time)
Task (done) @0x00007f23fa2a8010
julia> @time ppwrite()
0.062830 seconds (5.54 k allocations: 289.161 KiB)
Task (done) @0x00007f23f8734010
Upvotes: 0
Reputation: 10127
If you really want to write in parallel (using multiple processes) you can do it as follows:
using Distributed
addprocs(4) # using, say, 4 worker processes
function ppwrite()
    @sync @distributed for i in 1:10
        open("file$(i).txt", "w") do f
            write(f, "content")
        end
    end
end
For comparison, the serial version would be
function swrite()
    for i in 1:10
        open("file$(i).txt", "w") do f
            write(f, "content")
        end
    end
end
On my machine (ssd + quadcore) this leads to a ~70% speedup:
julia> @btime ppwrite();
3.586 ms (505 allocations: 25.56 KiB)
julia> @btime swrite();
6.066 ms (90 allocations: 6.41 KiB)
However, be aware that these timings might drastically change for real content, which might have to be transferred to different processes. Also they probably won't scale as IO will typically be the bottleneck at some point.
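One way to sidestep the transfer cost (a sketch, assuming the content can be reconstructed deterministically on each worker; make_content and ppwrite_local are hypothetical names) is to define a generator with @everywhere, so the @distributed loop body calls a worker-local function instead of shipping data from the master:

```julia
using Distributed
addprocs(2)  # two worker processes for the sketch

# Define the generator on every worker, so the loop body below calls a
# worker-local function instead of serializing data from the master.
# (make_content stands in for however the real content is produced.)
@everywhere make_content(i) = "payload for file $i"

function ppwrite_local()
    @sync @distributed for i in 1:10
        open("file$(i).txt", "w") do f
            write(f, make_content(i))
        end
    end
end
```

With this arrangement only the loop indices travel between processes; each worker materializes its slice of the data locally.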
Update: larger (string) content
julia> using Distributed, Random, BenchmarkTools
julia> addprocs(4);
julia> global const content = [string(rand(1000,1000)) for _ in 1:10];
julia> function ppwrite()
           @sync @distributed for i in 1:10
               open("file$(i).txt", "w") do f
                   write(f, content[i])
               end
           end
       end
ppwrite (generic function with 1 method)
julia> function swrite()
           for i in 1:10
               open("file$(i).txt", "w") do f
                   write(f, content[i])
               end
           end
       end
swrite (generic function with 1 method)
julia> @btime swrite()
63.024 ms (110 allocations: 6.72 KiB)
julia> @btime ppwrite()
23.464 ms (509 allocations: 25.63 KiB) # ~ 2.7x speedup
Doing the same thing with string representations of larger 10000x10000 matrices (3 instead of 10) results in
julia> @time swrite()
7.189072 seconds (23.60 k allocations: 1.208 MiB)
julia> @time swrite()
7.293704 seconds (37 allocations: 2.172 KiB)
julia> @time ppwrite();
16.818494 seconds (2.53 M allocations: 127.230 MiB) # > 2x slowdown of first call
julia> @time ppwrite(); # ~30% slowdown of second call
9.729389 seconds (556 allocations: 35.453 KiB)
Upvotes: 4
Reputation: 69839
From your question I assume that you do not mean write in parallel (which probably would not speed things up, as the operation is likely IO bound).
Your solution has one small problem: it does not guarantee that f is closed if write throws an exception.
Here are three alternative ways to do it making sure the file is closed even on error:
for fname in ["file1.txt", "file2.txt"]
    open(fname, "w") do f
        write(f, "content")
    end
end

for fname in ["file1.txt", "file2.txt"]
    open(f -> write(f, "content"), fname, "w")
end

foreach(fn -> open(f -> write(f, "content"), fn, "w"),
        ["file1.txt", "file2.txt"])
They give the same result, so the choice is a matter of taste (you can derive more variations along similar lines).
All the methods are based on the following method of the open function:
open(f::Function, args...; kwargs...)
Apply the function f to the result of open(args...; kwargs...) and close the resulting file descriptor upon completion.
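In other words, the do-block form is roughly equivalent to this explicit try/finally pattern (a sketch of what open does for you, not its actual implementation):

```julia
f = open("file1.txt", "w")
try
    write(f, "content")
finally
    close(f)  # runs whether or not write throws
end
```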
Observe that the processing will still be terminated if an exception is actually thrown somewhere (it is only guaranteed that the file descriptor will be closed). In order to ensure that every write operation is actually attempted, you can do something like:
for fname in ["file1.txt", "file2.txt"]
    try
        open(fname, "w") do f
            write(f, "content")
        end
    catch ex
        # here decide what should happen on error
        # you might want to investigate the value of ex here
    end
end
See https://docs.julialang.org/en/latest/manual/control-flow/#The-try/catch-statement-1 for the documentation of try/catch.
Upvotes: 6