Reputation: 99
How can I get the signature value from a signed PDF file? I can get all other data from the signature except its value. Is there any way to get it in C#?
PdfPKCS7 pk;
PdfReader reader = new PdfReader(PdfFilename);
AcroFields af = reader.AcroFields;
var names = af.GetSignatureNames();
foreach (string name in names)
{
pk = af.VerifySignature(name);
var CN_signer = iTextSharp.text.pdf.security.CertificateInfo.GetSubjectFields(pk.SigningCertificate).GetField("CN");
var C_signer = iTextSharp.text.pdf.security.CertificateInfo.GetSubjectFields(pk.SigningCertificate).GetField("C");
var CN_issuer = iTextSharp.text.pdf.security.CertificateInfo.GetIssuerFields(pk.SigningCertificate).GetField("CN");
var OU_issuer = iTextSharp.text.pdf.security.CertificateInfo.GetIssuerFields(pk.SigningCertificate).GetField("OU");
var O_issuer = iTextSharp.text.pdf.security.CertificateInfo.GetIssuerFields(pk.SigningCertificate).GetField("O");
var C_issuer = iTextSharp.text.pdf.security.CertificateInfo.GetIssuerFields(pk.SigningCertificate).GetField("C");
var nr_serial = pk.SigningCertificate.SerialNumber;
var date = pk.SignDate.ToString();
}
Upvotes: 2
Views: 7671
Reputation: 96039
The OP clarified that the signature Value was meant to refer to the PKCS#7/CMS signature container. The following sample method can do just that:
public void showSignatureValues(PdfReader reader)
{
AcroFields fields = reader.AcroFields;
foreach (String name in fields.GetSignatureNames())
{
Console.Write(" Signature {0}\n", name);
PdfDictionary sigDict = fields.GetSignatureDictionary(name);
PdfName subFilter = sigDict.GetAsName(PdfName.SUBFILTER);
Console.Write(" SubFilter {0}\n", subFilter);
PdfString contents = sigDict.GetAsString(PdfName.CONTENTS);
if (contents != null)
{
byte[] contentBytes = contents.GetOriginalBytes();
string contentBase64 = Convert.ToBase64String(contentBytes);
// contentBytes contains the actual signature container as is,
// contentBase64 contains it encoded using Base64 for better printability
Console.Write(" Content {0}\n", contentBase64);
}
}
}
One remark, though: You will find that contentBytes usually contains numerous 00 bytes after the signature container bytes (in the Base64 representation they show as a long string of the letter A). This is because very often a generous estimate of the signature container size is made when preparing a PDF for signing, and more than enough space is reserved for injecting it.
According to the specification, since the length of PKCS#7 objects is not entirely predictable, the value of Contents shall be padded with zeros at the end.
Using an ASN.1 parser you can determine how long the actual signature container byte sequence is and where the padding starts.
In theory the value of Contents shall be a DER-encoded PKCS#7 binary data object; as the DER encoding rules do not allow the indefinite-length method, the size of the signature container should be determinable from its first few bytes. Unfortunately there are numerous PDFs in the wild in which the outer layers of the signature container are merely BER encoded and only certain inner objects are DER encoded. Thus, complete parsing can be required.
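To illustrate the point about the leading bytes, here is a minimal sketch that reads only the outermost tag and length octets and cuts off the zero padding. The class and method names (DerLengthSketch, GetEncodedLength, StripPadding) are made up for this example and are not part of iTextSharp. The sketch assumes a single-byte tag and deliberately gives up on the indefinite-length form, which is exactly the BER case where a real ASN.1 parser is needed.

```csharp
using System;

public static class DerLengthSketch
{
    // Returns the total encoded size (tag + length octets + content) of the
    // ASN.1 object starting at index 0, or -1 if that size cannot be read
    // from the header alone (indefinite length, multi-byte tag, bad input).
    public static int GetEncodedLength(byte[] data)
    {
        if (data == null || data.Length < 2) return -1;
        if ((data[0] & 0x1F) == 0x1F) return -1;   // multi-byte tag: not handled here

        int first = data[1];
        int lengthOctets = 0;
        long contentLength;

        if (first < 0x80)
        {
            contentLength = first;                 // short form: the length itself
        }
        else
        {
            lengthOctets = first & 0x7F;           // long form: number of length octets
            if (lengthOctets == 0) return -1;      // 0x80 = indefinite length (BER only)
            if (lengthOctets > 4 || data.Length < 2 + lengthOctets) return -1;

            contentLength = 0;
            for (int i = 0; i < lengthOctets; i++)
                contentLength = (contentLength << 8) | data[2 + i];
        }

        long total = 2 + lengthOctets + contentLength;
        return total > data.Length ? -1 : (int)total;
    }

    // Returns the container without trailing padding, or the input unchanged
    // if the size could not be determined from the header.
    public static byte[] StripPadding(byte[] contentBytes)
    {
        int total = GetEncodedLength(contentBytes);
        if (total < 0) return contentBytes;

        byte[] container = new byte[total];
        Array.Copy(contentBytes, container, total);
        return container;
    }
}
```

Called as DerLengthSketch.StripPadding(contentBytes), this would return only the container bytes whenever the outer layer uses a definite length. Simply trimming trailing 00 bytes instead is not safe, because the container itself may legitimately end with a zero byte.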
In the answer above I claimed bluntly that the sample code returns a PKCS#7/CMS signature container. Actually it is such a signature container only in most cases; it depends on the SubFilter of the signature field value.
Let's look at the SubFilter values defined in ISO 32000-1 (the PDF specification) and in the ETSI Technical Specification 102778 parts (PAdES):
adbe.x509.rsa_sha1 ISO 32000-1 - In this case the contents actually are a DER-encoded PKCS#1 binary data object. This is the case depicted in the OP's graphic.
The OP calls the contents an encrypted digest, which is only part of the truth because
the PKCS#1 data object is constructed not from the bare digest but from a structure containing both that digest and the OID of the digest algorithm, and
depending on the signature algorithm this structure may not be encrypted (in the sense of something that can be decrypted back to the digest again); instead, only a number may be derived from it which cannot be decrypted back to the structure but merely tested against an alleged document digest structure.
This format is hardly in use anymore.
adbe.pkcs7.detached ISO 32000-1, ETSI TS 102778-2 - The contents are a DER-encoded PKCS#7 binary data object signing the byte range directly, i.e. normally the byte range digest is in the signed attribute MessageDigest.
adbe.pkcs7.sha1 ISO 32000-1, ETSI TS 102778-2 - The contents are a DER-encoded PKCS#7 binary data object signing the byte range indirectly, i.e. the byte range SHA1 digest is put into the container as data which in turn is signed normally.
ETSI.CAdES.detached ETSI TS 102778-3 - The contents are a DER-encoded SignedData object as specified in CMS signing the byte range directly; essentially this is a specially profiled variant of adbe.pkcs7.detached.
ETSI.RFC3161 ETSI TS 102778-4 - The contents are a TimeStampToken as specified in RFC 3161 stamping the byte range directly; this is a time stamp format closely related to PKCS#7. (This is a special case as the form field type is not Sig but DocTimeStamp.)
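As an illustration of the adbe.x509.rsa_sha1 remark above: for RSA with PKCS#1 v1.5 padding and SHA-1, the structure containing the digest and the digest algorithm OID is the DigestInfo structure, whose DER encoding is a fixed 15-byte prefix followed by the 20 digest bytes. The following is a minimal sketch only. The class and method names are made up for this example, and signedBytes stands in for the concatenated signed byte ranges of a real PDF.

```csharp
using System;
using System.Security.Cryptography;

public static class DigestInfoSketch
{
    // DER prefix of a SHA-1 DigestInfo as given in PKCS#1 (RFC 8017, section 9.2):
    // SEQUENCE { SEQUENCE { OID 1.3.14.3.2.26, NULL }, OCTET STRING (20 bytes) }
    private static readonly byte[] Sha1Prefix =
    {
        0x30, 0x21, 0x30, 0x09, 0x06, 0x05, 0x2B, 0x0E,
        0x03, 0x02, 0x1A, 0x05, 0x00, 0x04, 0x14
    };

    // Builds the structure that is actually fed into the RSA operation:
    // not the bare digest, but the digest wrapped together with its algorithm OID.
    public static byte[] BuildSha1DigestInfo(byte[] signedBytes)
    {
        byte[] digest;
        using (SHA1 sha1 = SHA1.Create())
        {
            digest = sha1.ComputeHash(signedBytes);
        }

        byte[] digestInfo = new byte[Sha1Prefix.Length + digest.Length];
        Buffer.BlockCopy(Sha1Prefix, 0, digestInfo, 0, Sha1Prefix.Length);
        Buffer.BlockCopy(digest, 0, digestInfo, Sha1Prefix.Length, digest.Length);
        return digestInfo;
    }
}
```

With RSA and this padding scheme, a verifier can recover such a structure from the signature value and compare it byte by byte with one rebuilt from the document. With other signature algorithms no such recovery is possible, which is the second point made in the list.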
Only in the case of adbe.x509.rsa_sha1 are the certificates involved included in separate signature dictionary entries. In all other cases certificates (and other security related material) are included in the SignedData structure in the Contents.
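The SubFilter cases listed above can be summarized in a small lookup that tells a caller what kind of object to expect in Contents before parsing it. This is only a sketch: the enum and class names are made up for this example, and it assumes the SubFilter arrives as a string which may or may not carry the leading slash of a PDF name.

```csharp
using System;

public enum ContentsKind
{
    Pkcs1,           // adbe.x509.rsa_sha1: PKCS#1 object, certificates stored outside Contents
    Pkcs7OrCms,      // PKCS#7/CMS SignedData container
    TimeStampToken,  // RFC 3161 time stamp token (field type DocTimeStamp)
    Unknown
}

public static class SubFilterSketch
{
    // Maps a SubFilter name such as "/adbe.pkcs7.detached" to the kind of
    // object its Contents entry holds.
    public static ContentsKind Classify(string subFilter)
    {
        string name = (subFilter ?? string.Empty).TrimStart('/');
        switch (name)
        {
            case "adbe.x509.rsa_sha1":
                return ContentsKind.Pkcs1;
            case "adbe.pkcs7.detached":
            case "adbe.pkcs7.sha1":
            case "ETSI.CAdES.detached":
                return ContentsKind.Pkcs7OrCms;
            case "ETSI.RFC3161":
                return ContentsKind.TimeStampToken;
            default:
                return ContentsKind.Unknown;
        }
    }
}
```

In the sample method above, the value could be obtained by passing subFilter.ToString() to Classify, assuming subFilter is not null.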
Upvotes: 8