Reputation: 122202
I am writing a PDF comparison utility. After some investigation, it seems the best way to do this is to convert each PDF to TIFF and compare the results.
I managed to do the conversion with Ghostscript, but the output files differ in their embedded creation date metadata.
How can I modify this metadata using .NET?
Upvotes: 3
Views: 1006
Reputation: 1
It seems that this Ghostscript behavior can be suppressed with the following switch:
-dTIFFDateTime=false
https://www.ghostscript.com/doc/9.22/Devices.htm
That said, for this situation I would recommend a dedicated PDF diff tool such as diffpdf (http://soft.rubypdf.com/software/diffpdf).
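Since the question asks about .NET, here is a minimal sketch of passing that switch to Ghostscript from C#. The executable path, the tiff24nc device, and the 150 dpi resolution are assumptions chosen for illustration, and the switch requires a Ghostscript version that supports it (the linked documentation is for 9.22).

```csharp
using System.Diagnostics;

static class GhostscriptRunner
{
    // gsExe is an assumption: e.g. the full path to gswin64c.exe on Windows.
    public static int ConvertToTiff(string gsExe, string pdfPath, string tiffPath)
    {
        var psi = new ProcessStartInfo
        {
            FileName = gsExe,
            // -dTIFFDateTime=false asks the TIFF device to omit the DateTime tag.
            Arguments = "-dBATCH -dNOPAUSE -dSAFER -sDEVICE=tiff24nc -r150 " +
                        "-dTIFFDateTime=false " +
                        "-sOutputFile=\"" + tiffPath + "\" \"" + pdfPath + "\"",
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (Process process = Process.Start(psi))
        {
            process.WaitForExit();
            return process.ExitCode; // 0 indicates success
        }
    }
}
```

If both PDFs are converted this way, the resulting TIFFs should no longer differ because of the date stamp alone.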
Upvotes: 0
Reputation: 5639
If the date stamps are fixed size, a fun workaround for this type of problem is to write a FileStream which simply detects and blanks out such date stamps. In fact I've done this before for PDF comparison, on a project I worked on in school. The checksum comparison worked fine with just that, without any conversion to TIFF, though in our specific case we were sure all compared PDFs were generated by the same system, so that simplified things a bit.
The basic method is to make a subclass of FileStream with overridden ReadByte and Read functions, which contains the length and expected format of the date stamps. Whenever a read is performed the following happens:
The source code I wrote for the project back in the day is here.
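The date-blanking idea can be sketched as follows. This is not the original project code. It is a simplified variant that masks the whole file in memory before hashing, instead of patching each Read call in a FileStream subclass, which avoids handling stamps that straddle buffer boundaries. The class and method names are hypothetical, and it assumes the stamps use the PDF date form "D:" followed by 14 digits (YYYYMMDDHHmmSS).

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

static class PdfDateMasking
{
    // Assumption: each date stamp is "D:" followed by exactly 14 digits.
    private const int DigitCount = 14;

    // Returns a copy of the data with the digits of every date stamp set to '0'.
    public static byte[] BlankDateStamps(byte[] data)
    {
        byte[] result = (byte[])data.Clone();
        for (int i = 0; i + 2 + DigitCount <= result.Length; i++)
        {
            if (result[i] != (byte)'D' || result[i + 1] != (byte)':')
                continue;

            bool allDigits = true;
            for (int j = 0; j < DigitCount; j++)
            {
                byte b = result[i + 2 + j];
                if (b < (byte)'0' || b > (byte)'9') { allDigits = false; break; }
            }
            if (!allDigits)
                continue;

            for (int j = 0; j < DigitCount; j++)
                result[i + 2 + j] = (byte)'0';

            i += 1 + DigitCount; // the loop increment then moves past the stamp
        }
        return result;
    }

    // Hashes the file contents after the date stamps have been blanked.
    public static string MaskedChecksum(string path)
    {
        byte[] masked = BlankDateStamps(File.ReadAllBytes(path));
        using (SHA256 sha = SHA256.Create())
            return BitConverter.ToString(sha.ComputeHash(masked));
    }
}
```

Two PDFs can then be compared with `PdfDateMasking.MaskedChecksum(a) == PdfDateMasking.MaskedChecksum(b)`. As noted above, this only works when the files are otherwise byte-identical, for example when both were generated by the same system.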
Upvotes: 0
Reputation: 800
After more investigation, it seems Microsoft does provide a TIFF library with multi-image support. It's in the System.Windows.Media.Imaging namespace. To use it, add a reference to the PresentationCore assembly.
To access the TIFF metadata use this site as a reference: http://www.awaresystems.be/imaging/tiff/tifftags/baseline.html
This code reads the date field you were interested in (tag 306, DateTime), which follows the tag holding the Ghostscript name, and sets a new value on a clone of the metadata:
FileInfo fi = new FileInfo(@"C:\Users\Chris\Downloads\PdfVerificationTests.can_use_image_approval_mode.approved.tiff");
FileStream stream = fi.Open(FileMode.Open, FileAccess.ReadWrite,FileShare.None);
TiffBitmapDecoder decoder = new TiffBitmapDecoder(stream, BitmapCreateOptions.None, BitmapCacheOption.OnLoad);
BitmapMetadata bmd = (BitmapMetadata) decoder.Frames[0].Metadata;
string thedateval = (string) bmd.GetQuery("/ifd/{ushort=306}");
BitmapMetadata bmd2 = bmd.Clone();
bmd2.SetQuery("/ifd/{ushort=306}", "2013:05:30 20:07:52");
This code does not write out a modified TIFF, but is all the info you need to do so. Hope this helps as I feel I'm beating a dead horse.
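One way to write the modified metadata back out is to re-encode each frame with a cloned, edited BitmapMetadata. The following is a minimal sketch under a few assumptions: the class and method names are hypothetical, the output goes to a separate file, and the encoder's default compression is acceptable (the re-encoded file may therefore differ from the input in more than the date).

```csharp
using System.IO;
using System.Windows.Media.Imaging;

static class TiffDateRewriter
{
    // fixedDate uses the TIFF DateTime layout, e.g. "2013:05:30 20:07:52".
    public static void RewriteDateTime(string inputPath, string outputPath, string fixedDate)
    {
        using (FileStream input = File.OpenRead(inputPath))
        using (FileStream output = File.Create(outputPath))
        {
            var decoder = new TiffBitmapDecoder(
                input, BitmapCreateOptions.PreservePixelFormat, BitmapCacheOption.OnLoad);
            var encoder = new TiffBitmapEncoder();

            foreach (BitmapFrame frame in decoder.Frames)
            {
                // Clone the metadata so it is writable, or start fresh if there is none.
                BitmapMetadata metadata = frame.Metadata == null
                    ? new BitmapMetadata("tiff")
                    : ((BitmapMetadata)frame.Metadata).Clone();

                // Tag 306 is the baseline TIFF DateTime field.
                metadata.SetQuery("/ifd/{ushort=306}", fixedDate);

                encoder.Frames.Add(
                    BitmapFrame.Create(frame, frame.Thumbnail, metadata, frame.ColorContexts));
            }

            encoder.Save(output);
        }
    }
}
```

Writing the same fixed date into both TIFFs before comparing them should remove the date as a source of differences.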
This code will strip all the attributes from a multipage TIFF and leave the image content intact:
FileInfo fi = new FileInfo(@"C:\Users\Chris\Downloads\PdfVerificationTests.can_use_image_approval_mode.approved.tiff");
// The using blocks close both streams even if decoding or saving fails.
using (FileStream stream = fi.Open(FileMode.Open, FileAccess.ReadWrite, FileShare.None))
using (FileStream stream2 = new FileStream("empty.tif", FileMode.Create))
{
    TiffBitmapDecoder decoder = new TiffBitmapDecoder(stream, BitmapCreateOptions.None, BitmapCacheOption.None);
    TiffBitmapEncoder encoder = new TiffBitmapEncoder();
    for (int i = 0; i < decoder.Frames.Count; i++)
    {
        BitmapSource source = decoder.Frames[i];
        // Round the stride up to whole bytes so 1-bit and 4-bit formats work too.
        int stride = (source.PixelWidth * source.Format.BitsPerPixel + 7) / 8;
        byte[] data = new byte[stride * source.PixelHeight];
        source.CopyPixels(data, stride, 0);
        // Rebuilding the bitmap from raw pixels drops the original metadata.
        BitmapSource theSource = BitmapSource.Create(source.PixelWidth, source.PixelHeight, source.DpiX, source.DpiY, source.Format, source.Palette, data, stride);
        encoder.Frames.Add(BitmapFrame.Create(theSource));
    }
    // Errors are no longer swallowed by an empty catch block.
    encoder.Save(stream2);
}
Upvotes: 1
Reputation: 48726
You can use LibTiff.NET. It is open source. With this library, you can use the SetField method to modify any of the many tags in a TIFF file, including the TiffTag.DATETIME tag.
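A minimal sketch of that approach, assuming the BitMiracle.LibTiff.NET package is referenced. The class and method names are hypothetical, and the code has not been tested against every kind of multipage file. It opens the file in append mode and rewrites each directory with the new date.

```csharp
using System;
using BitMiracle.LibTiff.Classic;

static class LibTiffDateSetter
{
    // fixedDate uses the TIFF DateTime layout, e.g. "2013:05:30 20:07:52".
    public static void SetDateTime(string path, string fixedDate)
    {
        // "a" opens an existing file so its directories can be modified.
        using (Tiff image = Tiff.Open(path, "a"))
        {
            if (image == null)
                throw new InvalidOperationException("Could not open " + path);

            short pageCount = image.NumberOfDirectories();
            for (short page = 0; page < pageCount; page++)
            {
                image.SetDirectory(page);
                image.SetField(TiffTag.DATETIME, fixedDate);
                // Writes the updated directory at the end of the file and relinks it.
                image.RewriteDirectory();
            }
        }
    }
}
```

Because each rewritten directory is appended to the end of the file, the file grows slightly on every run. For a byte-level comparison it may be safer to apply this to copies of the TIFFs.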
Upvotes: 1