Reputation: 41
I would like to verify that a given file exists in a tar archive with Python before I get it as a file-like object. I've tried isreg(), but I am probably doing something wrong.
How can I check if a file exists in a tar archive with Python?
I tried
import tarfile
tar = tarfile.open("sample.tar", "w")
tar.add("test1.txt")
tar.add("test2.txt")
tar.add("test3.py")
tar.close()
tar = tarfile.open("sample.tar", "r")
tai = tar.tarinfo(name="test3.py")
print(tai.isreg())
print(tai.size())
tar.close()
Probably tai is wrong. In fact tai.size() is always 0.
Upvotes: 4
Views: 6614
Reputation: 9697
To list all the files inside a tar archive, you can use either the getmembers() or the getnames() method of a TarFile object. Then, to extract them, you can use either the extract() or the extractfile() method.
For example:
import tarfile

# Archive: "sample.tar" >> Content: "test1.txt", ...
filename = "test1.txt"
with tarfile.open("sample.tar", "r") as tar:
    if filename in tar.getnames():
        file = tar.extractfile(filename).read()
But keep in mind that the names returned are actually relative file paths. That means that if the "test1.txt" file you're looking for is stored in a "test" sub-directory inside the tar archive, then its TarInfo.name will actually be "test/test1.txt".
So, going back to the previous example, you should do something like:
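# Archive: "sample.tar" >> Content: "test", "test/test1.txt", ...
filename = "test1.txt"
with tarfile.open("sample.tar", "r") as tar:
    for name in tar.getnames():
        # Match the bare name or a path ending in "/<filename>", so that
        # e.g. "mytest1.txt" is not picked up by accident.
        if name == filename or name.endswith("/" + filename):
            file = tar.extractfile(name).read()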
Finally, to test it, you can use @patch() to mock tarfile.open().
For example:
import unittest
from unittest.mock import patch

class TestTarfile(unittest.TestCase):
    @patch('myfile.tarfile.open')
    def test_tarfile_open(self, mock_open):
        mock_open.return_value.__enter__.return_value.getnames.return_value = [
            "test",
            "test/test1.txt"
        ]
NOTE: As stated in the documentation, support for using TarFile objects as context managers in with statements was added in Python 3.2.
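The snippet above only configures the mock. A fuller sketch of the same idea, with assertions, could look like this. The function archive_contains is a made-up stand-in for the code under test, and the patch target is "tarfile.open" rather than 'myfile.tarfile.open' only because everything here lives in a single file:

```python
import tarfile
import unittest
from unittest.mock import patch


def archive_contains(path, filename):
    """Hypothetical function under test: is `filename` listed in the archive?"""
    with tarfile.open(path, "r") as tar:
        return filename in tar.getnames()


class TestArchiveContains(unittest.TestCase):
    @patch("tarfile.open")
    def test_reports_members(self, mock_open):
        # The object bound by "with tarfile.open(...) as tar" is whatever
        # __enter__ returns, so that is where getnames() gets configured.
        mock_tar = mock_open.return_value.__enter__.return_value
        mock_tar.getnames.return_value = ["test", "test/test1.txt"]

        self.assertTrue(archive_contains("sample.tar", "test/test1.txt"))
        # The bare name is not listed: names are paths inside the archive.
        self.assertFalse(archive_contains("sample.tar", "test1.txt"))
        mock_open.assert_called_with("sample.tar", "r")


# Run the test case programmatically so the example is self-contained.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestArchiveContains)
result = unittest.TextTestRunner().run(suite)
```

No real archive is touched here, so the test runs without "sample.tar" existing on disk.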
Upvotes: 0
Reputation: 18385
If you really need to check, then you can test for membership using the getnames method and the in operator:
>>> import tarfile
>>> tar = tarfile.open("sample.tar", "r")
>>> "test3.py" in tar.getnames()
True
However, I think that in Python (and when dealing with file systems in general), catching exceptions is preferred. It's better to attempt the read and catch an exception, because things can always change between checking for a file's existence and reading it later.
>>> try:
...     tar.getmember('contents.txt')
... except KeyError:
...     pass
...
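Applied to the question, a minimal sketch of this approach might look like the following. The helper names build_sample_archive and read_member are invented for illustration, and the archive is built in memory (standing in for sample.tar) so the example runs on its own:

```python
import io
import tarfile


def build_sample_archive():
    """Build a small in-memory tar archive (a stand-in for sample.tar)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        data = b"print('hello')\n"
        info = tarfile.TarInfo(name="test3.py")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    buffer.seek(0)
    return buffer


def read_member(tar, name):
    """Return the bytes of a regular file in the archive, or None.

    Hypothetical helper: getmember() raises KeyError for unknown names,
    so no separate existence check is needed.
    """
    try:
        member = tar.getmember(name)
    except KeyError:
        return None
    if not member.isreg():
        return None  # directories, links, devices, ...
    with tar.extractfile(member) as fileobj:
        return fileobj.read()


with tarfile.open(fileobj=build_sample_archive(), mode="r") as tar:
    found = read_member(tar, "test3.py")
    missing = read_member(tar, "contents.txt")
```

As far as I can tell, the code in the question fails because tar.tarinfo is the TarInfo class itself, so tar.tarinfo(name="test3.py") builds a new, empty TarInfo instead of looking up the member stored in the archive. Also note that size is an attribute of TarInfo, so it is read as member.size, without parentheses.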
Upvotes: 7
Reputation: 96081
This matches even if the tar file has the filename in a subdirectory, and uses normcase to mimic the filename case handling of the current OS (e.g. on Windows, searching for “readme.txt” should match “README.TXT” inside the tar file).
import os

def filename_in_tar(filename, atarfile):
    filename = os.path.normcase(filename)
    return any(
        filename == os.path.normcase(os.path.basename(tfn))
        for tfn in atarfile.getnames())
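A quick way to try this out is with a throwaway archive built in memory. The member name "docs/readme.txt" below is made up for the example, and the function is repeated so that the snippet runs on its own:

```python
import io
import os
import tarfile


def filename_in_tar(filename, atarfile):
    # Same logic as the function above, repeated so this runs standalone.
    filename = os.path.normcase(filename)
    return any(
        filename == os.path.normcase(os.path.basename(tfn))
        for tfn in atarfile.getnames())


# Build an archive in memory with a single file inside a subdirectory.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w") as tar:
    data = b"hello\n"
    info = tarfile.TarInfo(name="docs/readme.txt")
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))
buffer.seek(0)

with tarfile.open(fileobj=buffer, mode="r") as tar:
    in_subdir = filename_in_tar("readme.txt", tar)   # matched by base name
    absent = filename_in_tar("missing.txt", tar)
```

Because only the base name is compared, two files with the same name in different directories are indistinguishable to this check.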
Upvotes: 0
Reputation: 7701
You can use tar.getnames() and the in operator to do it:
$ touch a.txt
$ tar cvf a.tar a.txt
$ python
>>> import tarfile
>>> names = tarfile.open('a.tar').getnames()
>>> 'a.txt' in names
True
>>> 'b.txt' in names
False
Upvotes: 0
Reputation: 61124
Maybe use getnames()?
import tarfile

tar = tarfile.open('sample.tar', 'r')
if 'test3.py' in tar.getnames():
    print('test3.py is in sample.tar')
Upvotes: 0