Reputation: 6732
What's a more efficient way to compress and tar the same files to several servers at once? Right now I have:
for SERVER in $SERVERS; do
    tar czpf - my/directory | ssh $SERVER tar xzpf - &
done && wait
It gets the job done, and running the loop in parallel is an improvement, but with ~20 servers that's a lot of redundant zipping on my end. Is there a way to compute the tarball only once, then duplicate that same output into each ssh command in the loop?
Upvotes: 1
Views: 91
Reputation: 247082
This will not be DRY, but process substitutions and tee can do it:
tar czpf - my/directory | tee \
    >(ssh server1 tar xzpf -) \
    >(ssh server2 tar xzpf -) \
    >(ssh server3 tar xzpf -) \
    >(ssh server4 tar xzpf -) \
    ... \
    >(ssh server20 tar xzpf -) \
    >/dev/null
Redirecting to /dev/null at the end prevents the archive contents from being printed on the terminal.
...
I see this is just glglgl's point #1 spelled out explicitly. Made the answer community wiki.
Upvotes: 3
Reputation: 91119
You could do something with tee and either a FIFO or, on more advanced shells such as bash, the >(command) construct.
This could work in 2 possible ways:
1. You could do all the teeing in one command, e.g. tee >(ssh host1 tar xzpf -) >(ssh host2 tar xzpf -). How to build that command for a variable list of hosts in Bash is not completely clear to me; either you could use an array (but I am not sure), or you'll have to use eval (a rough sketch follows this list).

2. You could start with an "initial" named FIFO into which the tar data is piped. Then, on each iteration, you do mkfifo next_pipe; tee <previous_pipe >(ssh ...) > next_pipe, i.e. you create a named FIFO in each iteration. At the end, you'll have to cat the last pipe to /dev/null in order to start the whole, complicated pipe.
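For the first variant, a rough, untested sketch of the eval route could look like this, assuming the server names sit in $SERVERS and contain no whitespace or shell metacharacters:

cmd="tee"
for SERVER in $SERVERS; do
    cmd="$cmd >(ssh $SERVER tar xzpf -)"    # append one process substitution per server
done
tar czpf - my/directory | eval "$cmd" > /dev/null

eval re-parses the assembled string, so the process substitutions are expanded there and the single tar stream is fanned out to all the ssh processes at once.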
For the second approach, something like this could work:
mkfifo 0
tar czpf - my/directory > 0 &    # backgrounded: writing to a FIFO blocks until a reader opens it
i=0
for SERVER in $SERVERS; do
    i=$((i+1))
    mkfifo $i
    tee <$((i-1)) >(ssh $SERVER tar xzpf -) >$i &
done
cat $i > /dev/null               # read the last FIFO so the whole chain starts flowing
and then wait for all processes to finish. And then clean up the FIFOs in your directory!
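A rough sketch of that last step, assuming seq is available and nothing else in the working directory uses those numeric names:

wait                    # wait for the backgrounded tar and tee jobs to finish
rm -f $(seq 0 $i)       # remove the numbered FIFOs 0..$i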
That said, I personally would probably write a Python script which does all that teeing and which works via pipes, not via named FIFOs. Among other things, that spares you the pain of cleaning up at the end.
If that is not an option, using a local intermediate file seems the best solution to me.
But if it is, here is a first shot at a Python program which does the job. It is an ad-hoc attempt and completely untested, but it shows the idea.
#!/usr/bin/env python
import sys
import subprocess

cmd1 = sys.argv[1]     # local command that writes the archive to stdout
cmd2 = sys.argv[2]     # remote command that reads the archive from stdin
hosts = sys.argv[3:]

# local producer
sp1 = subprocess.Popen(cmd1, shell=True, stdout=subprocess.PIPE)

# one ssh consumer per host
spn = []
for host in hosts:
    sp = subprocess.Popen(['ssh', host, cmd2], stdin=subprocess.PIPE)
    spn.append(sp)

# fan the stream out block by block
while True:
    block = sp1.stdout.read(4096)
    if not block:
        break
    for sp in spn:
        sp.stdin.write(block)

# close the pipes so the remote commands see EOF
for sp in spn:
    sp.stdin.close()

c = sp1.wait()
for sp in spn:
    c2 = sp.wait()
    if not c:
        c = c2
sys.exit(c)
Can be called with
./program 'tar czpf - my/directory' 'tar xzpf -' host1 host2 host3 ...
Upvotes: 2