Reputation: 10001
I have a tarball that I can't open using Python:
>>> import tarfile
>>> tarfile.open('/tmp/bad.tar.gz')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "tarfile.py", line 1672, in open
    raise ReadError("file could not be opened successfully")
tarfile.ReadError: file could not be opened successfully
but I'm able to extract the file with no problem on the command line:
$ tar -xzvf /tmp/bad.tar.gz
I've traced the Python tarfile code, and there's a function "nti" where they're converting bytes. It gets to this line:
obj.uid = nti(buf[108:116])
and blows up. The bytes for the UID field are coming through as eight spaces. Not sure where to go from here...
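For anyone who wants to check the same thing on their own archive, here is a minimal sketch of how the raw field can be inspected. The helper name raw_uid_field is made up for illustration, and it assumes the archive is gzip-compressed and that the first 512-byte block is an ordinary file header (not an extended header):

```python
import gzip


def raw_uid_field(path):
    """Return the 8 raw bytes of the uid field from the first tar header.

    Hypothetical helper for illustration only. Assumes a gzip-compressed
    tarball whose first 512-byte block is a regular header.
    """
    with gzip.open(path, "rb") as f:
        header = f.read(512)
    # Bytes 108..115 hold the uid, the same slice tarfile passes to nti().
    return header[108:116]
```

Calling repr(raw_uid_field('/tmp/bad.tar.gz')) should make it obvious whether the field holds octal digits or just spaces.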
Upvotes: 2
Views: 4325
Reputation: 10001
Honestly it looks like the bug is in tarfile.py's nti function:
n = int(nts(s) or "0", 8)
The fall-through logic (or "0") doesn't kick in because s is eight spaces, not an empty string: nts(s) returns a non-empty (truthy) string, so int() gets called on the spaces and blows up.
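To make the failure concrete, here is a small self-contained sketch. The nts below is a simplified stand-in I wrote for this example (not the real tarfile.py code), nti_original mirrors the quoted line, and nti_tolerant is a hypothetical variant showing one way the fallback could be made to work:

```python
def nts(field):
    """Simplified stand-in: return the field up to the first NUL as text."""
    end = field.find(b"\0")
    if end != -1:
        field = field[:end]
    return field.decode("ascii")


def nti_original(field):
    # Mirrors the quoted line: the fallback only applies when nts() is empty.
    return int(nts(field) or "0", 8)


def nti_tolerant(field):
    # Hypothetical variant: strip whitespace first so a blank field becomes "".
    return int(nts(field).strip() or "0", 8)


if __name__ == "__main__":
    blank = b" " * 8
    print(nti_tolerant(blank))      # blank field falls back to "0"
    try:
        nti_original(blank)
    except ValueError as exc:
        print("original fails:", exc)
```

An all-NUL field works with either version, since nts() returns an empty string there; only the all-spaces case trips the original.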
I copied tarfile.py from /var/lib/python2.7/ and wrapped the obj.uid assignment in a try/except, which fixed it for me:
try:
    obj.uid = nti(buf[108:116])
except InvalidHeaderError:
    obj.uid = 0
It's a hack, though. Really I'd prefer that the Python folks took a look at it and fixed the or "0" logic.
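If you'd rather not carry a patched copy of tarfile.py around, a similar hack can probably be applied at runtime. This is only a sketch: nti is an undocumented internal helper, and the approach assumes the header parser looks it up as a module-level global, so it may not hold across versions. The name lenient_nti is made up for this example:

```python
import tarfile

# Keep a reference to the stock helper so valid fields behave exactly as before.
_original_nti = tarfile.nti


def lenient_nti(field):
    """Treat a numeric header field made only of spaces/NULs as 0."""
    try:
        return _original_nti(field)
    except tarfile.InvalidHeaderError:
        if not field.strip(b" \0"):
            return 0
        raise  # anything else is still a genuinely bad header


# Assumption: tarfile resolves nti() through its module globals.
tarfile.nti = lenient_nti
```

After this runs, tarfile.open('/tmp/bad.tar.gz') should get past the blank UID field, with the uid reported as 0 just like in the try/except version. Note that it applies to every numeric header field, not only the UID.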
Update
Turns out the tarball was created by the maven-assembly-plugin in a Java 6 project that had just been upgraded to Java 7. The issue was resolved by upgrading the maven-assembly-plugin to 2.5.3.
Upvotes: 1