Reputation: 777
Is there a way, in Python (or another language), to open a JPEG file and determine whether or not it is corrupt? For instance, if I terminate a download of a JPG file before it completes, I am unable to open the file and view it. Are there libraries that allow this to be done easily?
Upvotes: 6
Views: 3775
Reputation: 43533
You can try using PIL. Just opening a truncated JPG file won't fail, and neither will the verify method. Trying to load it will raise an exception, though.
First we mangle a good jpg file:
> du mvc-002f.jpg
56 mvc-002f.jpg
> dd if=mvc-002f.jpg of=broken.jpg bs=1k count=20
20+0 records in
20+0 records out
20480 bytes transferred in 0.000133 secs (154217856 bytes/sec)
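If dd is not available (on Windows, say), the same truncation can be sketched in Python. The helper name truncate_copy and its parameters are made up for illustration; it just copies the first 20 KiB of the file, like the dd command above.

```python
def truncate_copy(src, dst, keep_bytes=20 * 1024):
    """Copy only the first keep_bytes bytes of src to dst.

    Hypothetical helper for producing a deliberately truncated test file.
    Returns the number of bytes actually written.
    """
    with open(src, "rb") as fin:
        head = fin.read(keep_bytes)  # read at most keep_bytes from the start
    with open(dst, "wb") as fout:
        fout.write(head)
    return len(head)
```

For example, truncate_copy('mvc-002f.jpg', 'broken.jpg') should leave a 20480-byte broken.jpg, assuming the source file is at least that large.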
Then we try the Python Imaging Library:
>>> import Image
>>> im = Image.open('broken.jpg')
>>> im.verify()
>>> im = Image.open('broken.jpg') # im.verify() invalidates the file pointer
>>> im.load()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/PIL/ImageFile.py", line 201, in load
    raise IOError("image file is truncated (%d bytes not processed)" % len(b))
IOError: image file is truncated (16 bytes not processed)
As user827992 said, even a truncated image can usually still be partially decoded and shown.
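If a full decode is too expensive, a cheaper heuristic for spotting interrupted downloads is to check the JPEG markers: a complete file normally starts with the start-of-image marker (FF D8) and ends with the end-of-image marker (FF D9). The sketch below uses only the standard library; the function name has_jpeg_markers is an assumption of mine, not part of PIL. It is only a rough test: it will not notice damage in the middle of the file, some files carry extra bytes after the end marker, and a truncated file could end in FF D9 by coincidence.

```python
import os

SOI = b"\xff\xd8"  # JPEG start-of-image marker
EOI = b"\xff\xd9"  # JPEG end-of-image marker

def has_jpeg_markers(path):
    """Heuristic check: True if the file starts with SOI and ends with EOI."""
    if os.path.getsize(path) < 4:
        return False  # too short to hold both markers
    with open(path, "rb") as f:
        start = f.read(2)
        f.seek(-2, os.SEEK_END)  # jump to the last two bytes
        end = f.read(2)
    return start == SOI and end == EOI
```

A file cut off mid-download will usually fail this check because its last two bytes are arbitrary image data rather than the end marker. Loading the image with PIL remains the more thorough test.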
Upvotes: 7
Reputation: 1753
I don't think so.
The JPEG standard is more like a container format than a specification of one particular implementation.
The word "corrupted" usually means that the file no longer represents the original data, but most of the time it can still be decoded. The output will be undefined rather than the image it was supposed to produce, but a JPEG decoder will most likely output something. Also, since there is no unique bit arrangement associated with the JPEG file format, you can't do this programmatically: there is no specific pattern to check against, and even if there were, you couldn't tell that a bit is in the wrong place or missing just by parsing the file, without knowing the original content.
The file header can also be corrupted, but in that case the file is probably treated as corrupted regardless of what it contains, just as any generic file would be.
Upvotes: 0
Reputation: 5537
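You could do it using the PIL package:
from PIL import Image  # Pillow; very old PIL installs used "import Image"

def is_image_ok(fn):
    """Return True if the image at fn decodes fully, False otherwise."""
    try:
        with Image.open(fn) as im:
            # open() only reads the header, so a truncated file still opens;
            # load() decodes the pixel data and raises on truncation.
            im.load()
        return True
    except Exception:
        return False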
Upvotes: 0