I try to hash a string using SHA256, I'm using the following code:
using System;
using System.Security.Cryptography;
using System.Text;
public class Hash
{
public static string getHashSha256(string text)
{
byte[] bytes = Encoding.Unicode.GetBytes(text);
SHA256Managed hashstring = new SHA256Managed();
byte[] hash = hashstring.ComputeHash(bytes);
string hashString = string.Empty;
foreach (byte x in hash)
{
hashString += String.Format("{0:x2}", x);
}
return hashString;
}
}
However, this code gives significantly different results compared to my friends php, as well as online generators (such as This generator)
Does anyone know what the error is? Different bases?
Encoding.Unicode
is Microsoft's misleading name for UTF-16 (a double-wide encoding, used in the Windows world for historical reasons but not used by anyone else). http://msdn.microsoft.com/en-us/library/system.text.encoding.unicode.aspx
If you inspect your bytes
array, you'll see that every second byte is 0x00
(because of the double-wide encoding).
You should be using Encoding.UTF8.GetBytes
instead.
But also, you will see different results depending on whether or not you consider the terminating '\0'
byte to be part of the data you're hashing. Hashing the two bytes "Hi"
will give a different result from hashing the three bytes "Hi"
. You'll have to decide which you want to do. (Presumably you want to do whichever one your friend's PHP code is doing.)
For ASCII text, Encoding.UTF8
will definitely be suitable. If you're aiming for perfect compatibility with your friend's code, even on non-ASCII inputs, you'd better try a few test cases with non-ASCII characters such as é
and 家
and see whether your results still match up. If not, you'll have to figure out what encoding your friend is really using; it might be one of the 8-bit "code pages" that used to be popular before the invention of Unicode. (Again, I think Windows is the main reason that anyone still needs to worry about "code pages".)