Reputation: 260
In the python documentation, it is adviced not to extract a tar archive without prior inspection. What is the best way to make sure an archive is safe using the tarfile python module? Should I just iterate over all the filename and check wether they contain absolute pathnames?
Would something like the following be sufficient?
import sys
import tarfile
with tarfile.open('sample.tar', 'r') as tarf:
for n in tarf.names():
if n[0] == '/' or n[0:2] == '..':
print 'sample.tar contains unsafe filenames'
sys.exit(1)
tarf.extractall()
This script is not compatible with versions prior to 2.7. cf with and tarfile.
I now iterate over the members:
target_dir = "/target/"
with closing(tarfile.open('sample.tar', mode='r:gz')) as tarf:
for m in tarf:
pathn = os.path.abspath(os.path.join(target_dir, m.name))
if not pathn.startswith(target_dir):
print 'The tar file contains unsafe filenames. Aborting.'
sys.exit(1)
tarf.extract(m, path=tdir)
Upvotes: 4
Views: 1753
Reputation: 154682
Almost, although it would still be possible to have a path like foo/../../
.
Better would be to use os.path.join
and os.path.abspath
, which together will correctly handle leading /
and ..
s anywhere in the path:
target_dir = "/target/" # trailing slash is important
with tarfile.open(…) as tarf:
for n in tarf.names:
if not os.path.abspath(os.path.join(target_dir, n)).startswith(target_dir):
print "unsafe filenames!"
sys.exit(1)
tarf.extractall(path=target_dir)
Upvotes: 4