Zach Young

Reputation: 11233

is_tarfile() returns True for a blank file

EDIT 1

Hmm, I accept the answers that tar respects an empty file... but on my system:

$ touch emptytar
$ tar -tf emptytar 
tar: This does not look like a tar archive
tar: Exiting with failure status due to previous errors

Maybe I have a non-canonical version?

$ tar --version
tar (GNU tar) 1.22
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by John Gilmore and Jay Fenlason.

Hello all,

I am testing some logic to handle a user uploading a TAR file. When I feed a blank file to tarfile.is_tarfile() it returns True, which is not what I am expecting:

$ touch tartest
$ cat tartest
$ python -c "import tarfile; print tarfile.is_tarfile('tartest')"
True

If I add some text to the file, it returns False, which I am expecting:

$ echo "not a tar" > tartest
$ python -c "import tarfile; print tarfile.is_tarfile('tartest')"
False

I could add a check at the beginning for a zero-length file, but based on the documentation for tarfile.is_tarfile(name) I think this is unnecessary:

Return True if name is a tar archive file, that the tarfile module can read.
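
For reference, the zero-length check mentioned above could look roughly like this. It is only a sketch: looks_like_tar is a made-up name, not part of the tarfile module, and it assumes the upload has already been saved to a path on disk.

```python
import os
import tarfile


def looks_like_tar(path):
    """Return True only for a non-empty file that tarfile can open.

    Hypothetical helper for illustration; not part of the tarfile module.
    """
    # Reject zero-length files up front, whatever is_tarfile() says about them.
    if os.path.getsize(path) == 0:
        return False
    return tarfile.is_tarfile(path)
```

With this guard the result for a blank file no longer depends on how the installed Python treats empty input.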

I went so far as to check the source, tarfile.py, and I can see that it is checking header blocks but I do not fully understand how it is evaluating those blocks.
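
A rough paraphrase of what is_tarfile() does, as a sketch rather than the module's exact source: it tries to open the file with tarfile.open() and reports whether that raised a tarfile error. The name is_tarfile_sketch is made up for illustration.

```python
import tarfile


def is_tarfile_sketch(name):
    """Paraphrase of the open-and-catch approach; not the stdlib source."""
    try:
        # The default mode "r" also tries the compressed variants.
        archive = tarfile.open(name)
        archive.close()
        return True
    except tarfile.TarError:
        return False
```

Read this way, the answer for a blank file comes down to whether tarfile.open() raises for it in the Python version at hand.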

Am I misreading the documentation and therefore setting unfair expectations?

Thank you,
Zachary

Upvotes: 7

Views: 2089

Answers (4)

Stephen C

Reputation: 719709

In fact, the behaviour of "is_tarfile" seems to have changed between Python 2.6 and 2.7. In Python 2.7, is_tarfile returns False for an empty file.

$ touch /tmp/foo.tar
$ python
Python 2.7.3 (default, Jul 24 2012, 11:41:40) 
[GCC 4.6.3 20120306 (Red Hat 4.6.3-2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tarfile
>>> print tarfile.is_tarfile("/tmp/foo.tar")
False
>>> 
$ 
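
To check which behaviour a given interpreter has, a small script along these lines should do. It is a sketch: the function name is made up, and it uses a temporary file instead of /tmp/foo.tar.

```python
import os
import sys
import tarfile
import tempfile


def empty_file_is_tarfile():
    """Report what is_tarfile() says about a zero-byte file on this interpreter."""
    fd, path = tempfile.mkstemp(suffix=".tar")
    os.close(fd)  # nothing was written, so the file is zero bytes long
    try:
        return tarfile.is_tarfile(path)
    finally:
        os.remove(path)


print("Python %s: %s" % (sys.version.split()[0], empty_file_is_tarfile()))
```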

Upvotes: 1

Alex Martelli

Reputation: 882781

An empty tar file is a perfectly valid, and empty, tar file. Consider, at any Unix shell prompt:

$ touch foo.tar
$ ls -l foo.tar
-rw-r--r--  1 aleax  staff  0 Jun 16 18:49 foo.tar
$ tar tvf foo.tar 
$ tar xvf foo.tar

See? The empty foo.tar is a perfectly valid tar file for the Unix tar command -- it just has nothing to show or to unpack. It would be truly problematic if Python's tar handling differed so drastically from that of tar itself! What sentence in the docs led you to believe that such a problematic, headache-inducing incompatibility is part of the specs?

Upvotes: 4

dkamins

Reputation: 21948

Try this at the command line:

$ touch emptyfile
$ tar -tvf emptyfile

No errors.

It looks like an empty file simply is a valid (but useless) TAR file.
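
For comparison, here is a sketch of what an archive with no members looks like when Python's own tarfile module writes it, assuming a recent Python. The file name is made up. The tar format marks the end of an archive with blocks of zero bytes, so an archive written this way is not zero bytes long, which distinguishes it from a file created with touch.

```python
import os
import tarfile
import tempfile

workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, "no_members.tar")  # hypothetical name

# Open for writing and close immediately: an archive with no members.
tarfile.open(archive_path, "w").close()

archive_size = os.path.getsize(archive_path)

# Read it back and list the members it contains.
with tarfile.open(archive_path) as archive:
    member_names = archive.getnames()

print(archive_size, member_names)
```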

Upvotes: 1

S.Lott

Reputation: 392070

This is a fundamental feature of logic.

The default assumption is "True" until proven false by the contents of the file.

No contents, no disproof of the assumption.

Upvotes: -1
