Reputation: 4035
What system call does tar use to read the contents of the files it archives? I tried using strace to see the call, but tar never calls open on the file.
$ echo "HelloWorld" > my_test_file
$ strace -s250 -f -F tar -cf /dev/null my_test_file 2>&1 | grep my_test_file
execve("/bin/tar", ["tar", "-cf", "/dev/null", "my_test_file"], [/* 20 vars */]) = 0
newfstatat(AT_FDCWD, "my_test_file", {st_mode=S_IFREG|0664, st_size=11, ...}, AT_SYMLINK_NOFOLLOW) = 0
newfstatat(AT_FDCWD, "my_test_file", {st_mode=S_IFREG|0664, st_size=11, ...}, AT_SYMLINK_NOFOLLOW) = 0
I am guessing the newfstatat is pretty much the same thing as fstatat (which "operates in exactly the same way as stat" except for some minor differences), so that probably isn't opening the file.
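As a small illustration of the point that a stat-style call reads only metadata, here is a minimal Python sketch (Python is chosen for the illustration; it is not part of the question). It assumes a Linux system, where `os.stat(..., follow_symlinks=False)` does not follow symlinks, much like the `AT_SYMLINK_NOFOLLOW` flag in the trace. The helper name `size_without_open` is hypothetical.

```python
import os
import tempfile

def size_without_open(path):
    """Return the file size from metadata only; the file is never opened or read."""
    st = os.stat(path, follow_symlinks=False)
    return st.st_size

# Recreate the test file from the question in a scratch directory.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "my_test_file")
with open(path, "w") as f:
    f.write("HelloWorld\n")

reported_size = size_without_open(path)
print(reported_size)
```

The size comes back without any open or read on the file, which matches the `st_size=11` field in the strace output.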
My version of tar:
$ tar --version
tar (GNU tar) 1.26
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by John Gilmore and Jay Fenlason.
My operating system:
$ uname -a
Linux myhostname 3.11.0-14-generic #21-Ubuntu SMP Tue Nov 12 17:04:55 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=13.10
DISTRIB_CODENAME=saucy
DISTRIB_DESCRIPTION="Ubuntu 13.10"
Upvotes: 1
Views: 438
Reputation: 13551
To me it seems that the source file is not read when writing to /dev/null or when it has zero size.
cd /tmp; echo test > testinput; diff -u <(strace -s250 -f tar -cf /dev/null testinput 2>&1) <(strace -s250 -f tar -cf testoutput testinput 2>&1) | less +'/open\("testinput"'
open is used on the input file when the output is not /dev/null and the input file is not empty. Tested with GNU tar 1.20 and strace 4.5.17.
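One way a program could make that decision is sketched below in Python. This only illustrates the observed behaviour and is not taken from the tar source: the function names `is_dev_null` and `should_read_contents` are hypothetical, and the assumption is that the archive is compared against /dev/null by device and inode number, which is what `os.path.samefile` does.

```python
import os
import tempfile

def is_dev_null(path):
    """True if path refers to the same file as /dev/null (same device and inode)."""
    try:
        return os.path.samefile(path, os.devnull)
    except FileNotFoundError:
        return False

def should_read_contents(archive_path, member_path):
    """Skip reading when the archive is /dev/null or the member is empty."""
    if is_dev_null(archive_path):
        return False
    return os.stat(member_path).st_size > 0

# Scratch files standing in for testinput and testoutput.
tmp_dir = tempfile.mkdtemp()
member = os.path.join(tmp_dir, "testinput")
empty = os.path.join(tmp_dir, "emptyinput")
archive = os.path.join(tmp_dir, "testoutput")
with open(member, "w") as f:
    f.write("test\n")
open(empty, "w").close()
open(archive, "w").close()

read_to_null = should_read_contents(os.devnull, member)
read_to_file = should_read_contents(archive, member)
read_empty = should_read_contents(archive, empty)
print(read_to_null, read_to_file, read_empty)
```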
Upvotes: 1
Reputation: 19266
Obviously, when you're tarring a file, it must be read by the process running tar. This is exactly what happens on my system. I created a 512-byte file from /dev/urandom and ran tar -cf file.tar file.xyz. After filtering out all the noise related to loading libraries into the process' image, you can see the actual relevant lines that strace reports:
creat("file.tar", 0666) = 3
We can see that the output file from the tar command is created with a requested mode of 0666, i.e. read/write permission for the owner, group, and world. The permissions the file actually ends up with are this mode with the umask bits cleared, and the umask is inherited from your shell. The new file's descriptor inside this process is 3.
openat(AT_FDCWD, "file.xyz", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_NOFOLLOW|O_CLOEXEC) = 4
Here, the file to be archived is opened and assigned the file descriptor 4.
fstat(4, {st_mode=S_IFREG|0644, st_size=512, ...}) = 0
tar calls fstat on the open file descriptor, probably to find out the file's type, permissions, and size.
read(4, "\225\243\263uG\320-\354!%\337\3376\311\210&\377T=aiO\10\203\375|y\304\231\203x."..., 512) = 512
We can see the file being actually read.
close(4) = 0
And properly closed.
write(3, "file.xyz\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 10240) = 10240
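The file referenced by descriptor 3, our output file, is being written to. We can't really see the contents of file.xyz in the write call, but this is probably because of the structure of the tar file: each archive member starts with a 512-byte header that begins with the file name, and strace prints only the first bytes of the buffer, so the file data that follows the header is cut off.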
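The layout of the archive can be checked with a short Python sketch. It uses Python's standard tarfile module rather than GNU tar, so it only shows the general ustar layout (a 512-byte header starting with the member name, followed by the data), not GNU tar's exact output. The payload is random data standing in for file.xyz.

```python
import io
import os
import tarfile

payload = os.urandom(512)  # stand-in for the 512-byte file.xyz

# Build a one-member archive entirely in memory.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
    info = tarfile.TarInfo(name="file.xyz")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

archive = buf.getvalue()
header = archive[:512]     # first block: the member header
body = archive[512:1024]   # second block: the file data

print(header[:32])         # name followed by NUL padding
print(body == payload)
```

The first bytes of the archive are the file name padded with NUL bytes, which is the same pattern that shows up at the start of the write call in the trace.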
close(3) = 0
Now the output file is closed, and the process exits (not shown here).
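The call sequence in the trace can be mimicked with Python's thin wrappers around the same system calls. This is a sketch under stated assumptions, not how tar is implemented: the function name `archive_one_file` is hypothetical, the "header" is just the file name padded to 512 bytes (not a valid tar header), and the flags assume Linux.

```python
import os
import tempfile

BLOCK = 512
RECORD = 20 * BLOCK  # 10240 bytes, the write size seen in the trace

def archive_one_file(src, dst):
    """Follow the traced sequence: create, openat, fstat, read, close, write, close."""
    # Equivalent of creat(dst, 0666).
    out_fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o666)
    # Same flags as the openat call in the trace.
    in_fd = os.open(src, os.O_RDONLY | os.O_NOCTTY | os.O_NONBLOCK
                    | os.O_NOFOLLOW | os.O_CLOEXEC)
    size = os.fstat(in_fd).st_size
    data = os.read(in_fd, size)  # a single read is enough for a small regular file
    os.close(in_fd)
    # Placeholder header: the name padded with NULs (NOT a real tar header).
    header = os.path.basename(src).encode().ljust(BLOCK, b"\0")
    # Pad header + data to one full record; only sensible for small files.
    os.write(out_fd, (header + data).ljust(RECORD, b"\0"))
    os.close(out_fd)
    return size

tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, "file.xyz")
dst = os.path.join(tmp_dir, "file.out")
payload = os.urandom(512)
with open(src, "wb") as f:
    f.write(payload)

bytes_read = archive_one_file(src, dst)
with open(dst, "rb") as f:
    written = f.read()
print(bytes_read, len(written))
```

Running this script under strace should show an open, fstat, read, close, write, close pattern similar to the one above, surrounded by the interpreter's own calls.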
Interestingly, at first I created an empty file with touch and tried to tar it. However, it seems like tar checks if the file is empty and, if it is, does not insert the data inside the tar archive. newfstatat returns the information about the size, which tar probably uses to make this decision.
However, you should really read the source to see how the actual execution looks. It is possible that, for example, files which are much larger are mmaped into the process and read this way, while smaller files are simply read with read.
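For reference, the two ways of getting at file contents mentioned here look like this in a minimal Python sketch. It only shows that both approaches yield the same bytes. Whether tar ever uses mmap is not established here, and the function names are hypothetical.

```python
import mmap
import os
import tempfile

def read_with_read(path):
    """Read the whole file with repeated read() calls."""
    fd = os.open(path, os.O_RDONLY)
    try:
        chunks = []
        while True:
            chunk = os.read(fd, 65536)
            if not chunk:  # empty result means end of file
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)

def read_with_mmap(path):
    """Map the file into memory and copy the bytes out of the mapping."""
    fd = os.open(path, os.O_RDONLY)
    try:
        if os.fstat(fd).st_size == 0:
            return b""  # an empty file cannot be mapped
        with mmap.mmap(fd, 0, access=mmap.ACCESS_READ) as mapping:
            return mapping[:]
    finally:
        os.close(fd)

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "file.xyz")
payload = os.urandom(4096)
with open(path, "wb") as f:
    f.write(payload)

via_read = read_with_read(path)
via_mmap = read_with_mmap(path)
print(via_read == via_mmap)
```

With the mmap variant, strace would show an mmap call on the descriptor instead of read calls, so the trace itself tells you which approach a program took.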
Upvotes: 2