Unable to verify digital signature using Apache PDFBOX

Ranjan picture Ranjan · Sep 21, 2014 · Viewed 7.1k times · Source

I am a newbie in using Digital Signatures. In one of the projects we are using Apache PdfBox for processing digitally signed pdf files. While we could test all features, verification of signed pdf files is something we are unable to crack. We are using BouncyCastle as the provider. Below is the code:

//Get Digital Signature and Signed Content from pdf file

byte[] signatureAsBytes = pdsignature.getContents(new FileInputStream(this.INPUT_FILE));
byte[] signedContentAsBytes = pdsignature.getSignedContent(new FileInputStream(this.INPUT_FILE));

//Digital Signature Verification

Security.addProvider(new BouncyCastleProvider());
Signature signer = Signature.getInstance("RSA","BC");

//Get PublicKey from p7b file
X509Certificate cert509=null;
File file = new File("C:\\certificate_file.p7b");
FileInputStream fis = new FileInputStream(file);
CertificateFactory cf = CertificateFactory.getInstance("X.509");
Collection c = cf.generateCertificates(fis);
Iterator it = c.iterator();
PublicKey pubkey;

while (it.hasNext()) 
{
   cert509 = (X509Certificate) it.next();
   pubkey = cert509.getPublicKey();
}

boolean VERIFIED=false;
Security.addProvider(new BouncyCastleProvider());
Signature signer = Signature.getInstance("RSA","BC");
PublicKey key=this.getPublicKey(false);
signer.initVerify(key);

List<PDSignature> allsigs = this.PDFDOC.getSignatureDictionaries();
Iterator<PDSignature> i = allsigs.iterator();

while(i.hasNext())
{
        PDSignature sig = (PDSignature) i.next();
        byte[] signatureAsBytes = sig.getContents(new FileInputStream(this.INPUT_FILE));
        byte[] signedContentAsBytes = sig.getSignedContent(new FileInputStream(this.INPUT_FILE));
        signer.update(signedContentAsBytes);
        VERIFIED=signer.verify(signatureAsBytes);
}

System.out.println("Verified="+VERIFIED);

Below are relevant extracts from the certificate in p7b format - I am using BouncyCastle as security provider:

  Signature Algorithm: SHA256withRSA, OID = 1.2.840.113549.1.1.11
  Key:  Sun RSA public key, 2048 bits
  Validity: [From: Tue Aug 06 12:26:47 IST 2013,
  To: Wed Aug 05 12:26:47 IST 2015]
  Algorithm: [SHA256withRSA]

With the above code I am always getting response as "false". I have no idea how to fix the issue. Please help

Answer

mkl picture mkl · Sep 22, 2014

Your prime problem is that there are multiple types of PDF signatures differing in the format of the signature container and in what actually are the signed bytes. Your BC code, on the other hand, can verify merely naked signature byte sequences which are contained in the afore-mentioned signature containers.

Interoperable signature types

As the header already says, the following list contains "interoperable signature types" which are more or less strictly defined. The PDF specification specifies a way to also include completely custom signing schemes. But let us assume we are in an interoperable situation. The the collection of signature types burns down to:

  • adbe.x509.rsa_sha1 defined in ISO 32000-1 section 12.8.3.2 PKCS#1 Signatures; the signature value Contents contain a DER-encoded PKCS#1 binary data object; this data object is a fairly naked signature, in case of RSA an encrypted structure containing the padded document hash and the hash algorithm.

  • adbe.pkcs7.sha1 defined in ISO 32000-1 section 12.8.3.3 PKCS#7 Signatures; the signature value Contents contain a DER-encoded PKCS#7 binary data object; this data object is a big container object which can also contain meta-information, e.g. it may contain certificates for building certificate chains, revocation information for certificate revocation checks, digital time stamps to fix the signing time, ... The SHA1 digest of the document’s byte range shall be encapsulated in the PKCS#7 SignedData field with ContentInfo of type Data. The digest of that SignedData shall be incorporated as the normal PKCS#7 digest.

  • adbe.pkcs7.detached defined in ISO 32000-1 section 12.8.3.3 PKCS#7 Signatures; the signature value Contents contain a DER-encoded PKCS#7 binary data object, see above. The original signed message digest over the document’s byte range shall be incorporated as the normal PKCS#7 SignedData field. No data shall be encapsulated in the PKCS#7 SignedData field.

  • ETSI.CAdES.detached defined in ETSI TS 102 778-3 and will become integrated in ISO 32000-2; the signature value Contents contain a DER-encoded SignedData object as specified in CMS; CMS signature containers are close relatives to PKCS#7 signature containers, see above. This essentially is a differently profiled and stricter defined variant of adbe.pkcs7.detached.

  • ETSI.RFC3161 defined in ETSI TS 102 778-4 and will become integrated in ISO 32000-2; the signature value Contents contain a TimeStampToken as specified in RFC 3161; time stamp tokens again are a close relative to PKCS#7 signature containers, see above, but they contain a special data sub-structure harboring the document hash, the time of the stamp creation, and information on the issuing time server.

I would propose studying the specifications I named and the documents referenced from there, mostly RFCs. Based on that knowledge you can easily find the appropriate BouncyCastle classes to analyze the different signature Contents.