I'm getting two different hashes for the same XML document: one when I canonicalize the XML directly and hash the result, and another when I digitally sign it, which should apply the same canonicalization algorithm to the XML before hashing it. I worked out that the canonicalization performed during signing keeps the newline characters ('\n') and spacing characters, while the direct approach does not.
Keeping the newline characters and spaces doesn't appear to be in the canonicalization specification, though. I'm specifically looking at this version: http://www.w3.org/TR/2001/REC-xml-c14n-20010315
Does anyone know what is going on? I've included the XML document and both implementations of the code so you can see.
This is really puzzling me and I'd like to know why. Am I missing something obvious?
<root>
<child1>some text</child1>
<child2 attr="1" />
</root>
The direct canonicalization code
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Security.Cryptography.Xml;
using System.Security.Cryptography;
using System.IO;
using System.ComponentModel;
namespace XML_SignatureGenerator
{
class XML_C14N
{
private String _filename;
private Boolean isCommented = false;
private XmlDocument xmlDoc = null;
public XML_C14N(String filename)
{
_filename = filename;
xmlDoc = new XmlDocument();
xmlDoc.Load(_filename);
}
//implement this spec http://www.w3.org/TR/2001/REC-xml-c14n-20010315
public String XML_Canonalize(System.Windows.Forms.RichTextBox tb)
{
//create c14n instance and load in xml file
XmlDsigC14NTransform c14n = new XmlDsigC14NTransform(isCommented);
c14n.LoadInput(xmlDoc);
//get canonicalized stream
Stream s1 = (Stream)c14n.GetOutput(typeof(Stream));
SHA1 sha1 = new SHA1CryptoServiceProvider();
Byte[] output = sha1.ComputeHash(s1);
tb.Text = Convert.ToBase64String(output);
//create new xmldocument and save
String newFilename = _filename.Substring(0, _filename.Length - 4) + "C14N.xml";
XmlDocument xmldoc2 = new XmlDocument();
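//ComputeHash has read the stream to its end, so rewind before loading from it
//(assumes the transform returns a seekable stream such as a MemoryStream)
s1.Position = 0;
xmldoc2.Load(s1);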
xmldoc2.Save(newFilename);
return newFilename;
}
public void set_isCommented(Boolean value)
{
isCommented = value;
}
public Boolean get_isCommented()
{
return isCommented;
}
}
}
The xml digital signature code
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Security.Cryptography;
using System.Security.Cryptography.Xml;
namespace XML_SignatureGenerator
{
class xmlSignature
{
public xmlSignature(String filename)
{
_filename = filename;
}
public Boolean SignXML()
{
RSACryptoServiceProvider rsa = new RSACryptoServiceProvider();
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.PreserveWhitespace = true;
String fname = _filename; //"C:\\SigTest.xml";
xmlDoc.Load(fname);
SignedXml xmlSig = new SignedXml(xmlDoc);
xmlSig.SigningKey = rsa;
Reference reference = new Reference();
reference.Uri = "";
XmlDsigC14NTransform env = new XmlDsigC14NTransform(false);
reference.AddTransform(env);
xmlSig.AddReference(reference);
xmlSig.ComputeSignature();
XmlElement xmlDigitalSignature = xmlSig.GetXml();
xmlDoc.DocumentElement.AppendChild(xmlDoc.ImportNode(xmlDigitalSignature, true));
xmlDoc.Save(Environment.GetFolderPath(Environment.SpecialFolder.Desktop) + "/SignedXML.xml");
return true;
}
private String _filename;
}
}
Any idea would be great! It's all C# code by the way.
Thanks in advance
Jon
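The way in which XML Signature handles whitespace is, in my opinion, broken. It's certainly not compliant with what most right-thinking people would call canonicalization. Changing whitespace should not affect the digest, but in XML Signature it does: Canonical XML retains whitespace in character content, so the newlines and indentation between elements end up in the bytes that are hashed. In your code, the signing path loads the document with PreserveWhitespace = true while the direct path leaves it at its default of false, which would account for the two different digests.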
One possible workaround is to pass the document through a canonicalizer routine before passing it to the signature generation code. That should make things far more predictable.
This article might help clarify things.
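To see the effect in isolation, here is a minimal sketch that canonicalizes the same XML twice with XmlDsigC14NTransform, once with PreserveWhitespace off and once with it on, and compares the SHA-1 digests. The class name C14nWhitespaceDemo and the helper C14nDigest are made up for illustration, and the sample string is assumed to match the document in the question with '\n' line endings. The two digests should differ, and using the same PreserveWhitespace setting in both code paths is one way to get them to agree.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Security.Cryptography.Xml;
using System.Xml;

static class C14nWhitespaceDemo
{
    // Hypothetical helper: Base64 SHA-1 digest of the C14N form of an XML string.
    static string C14nDigest(string xml, bool preserveWhitespace)
    {
        XmlDocument doc = new XmlDocument();
        // Must be set before loading; with false, whitespace-only nodes
        // between elements are dropped while the document is parsed.
        doc.PreserveWhitespace = preserveWhitespace;
        doc.LoadXml(xml);

        // Same transform as in the question, comments excluded.
        XmlDsigC14NTransform c14n = new XmlDsigC14NTransform(false);
        c14n.LoadInput(doc);

        using (Stream canonical = (Stream)c14n.GetOutput(typeof(Stream)))
        using (SHA1 sha1 = SHA1.Create())
        {
            return Convert.ToBase64String(sha1.ComputeHash(canonical));
        }
    }

    static void Main()
    {
        // Assumed to mirror the sample document from the question.
        string xml = "<root>\n<child1>some text</child1>\n<child2 attr=\"1\" />\n</root>";

        string stripped = C14nDigest(xml, false);  // like the direct canonicalization code
        string preserved = C14nDigest(xml, true);  // like the signing code

        Console.WriteLine("PreserveWhitespace = false: " + stripped);
        Console.WriteLine("PreserveWhitespace = true:  " + preserved);
        Console.WriteLine("Digests match: " + (stripped == preserved));
    }
}
```

Which setting is right depends on what the verifier will do: it has to see exactly the same whitespace as the signer, so whichever you pick needs to be applied consistently on both sides.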