Different results with Java's digest versus external utilities

Mike Viens picture Mike Viens · Mar 15, 2012 · Viewed 13.6k times · Source

I have written a simple Java class to generate the hash values of the Windows Calculator file. I am using Windows 7 Professional with SP1. I have tried Java 6.0.29 and Java 7.0.03. Can someone tell me why I am getting different hash values from Java versus (many!) external utilities and/or websites? Everything external matches with each other, only Java is returning different results.

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Map.Entry;
import java.util.zip.CRC32;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Checksum 
{
    private static int size = 65536;
    private static File calc = new File("C:/Windows/system32/calc.exe");

    /*
        C:\Windows\System32\calc.exe (verified via several different utilities)
        ----------------------------
        CRC-32b = 8D8F5F8E
        MD5     = 60B7C0FEAD45F2066E5B805A91F4F0FC
        SHA-1   = 9018A7D6CDBE859A430E8794E73381F77C840BE0
        SHA-256 = 80C10EE5F21F92F89CBC293A59D2FD4C01C7958AACAD15642558DB700943FA22
        SHA-384 = 551186C804C17B4CCDA07FD5FE83A32B48B4D173DAC3262F16489029894FC008A501B50AB9B53158B429031B043043D2
        SHA-512 = 68B9F9C00FC64DF946684CE81A72A2624F0FC07E07C0C8B3DB2FAE8C9C0415BD1B4A03AD7FFA96985AF0CC5E0410F6C5E29A30200EFFF21AB4B01369A3C59B58


        Results from this class
        -----------------------
        CRC-32  = 967E5DDE
        MD5     = 10E4A1D2132CCB5C6759F038CDB6F3C9
        SHA-1   = 42D36EEB2140441B48287B7CD30B38105986D68F
        SHA-256 = C6A91CBA00BF87CDB064C49ADAAC82255CBEC6FDD48FD21F9B3B96ABF019916B    
    */    

    public static void main(String[] args)throws Exception {
        Map<String, String> hashes = getFileHash(calc);
        for (Map.Entry<String, String> entry : hashes.entrySet()) {
            System.out.println(String.format("%-7s = %s", entry.getKey(), entry.getValue()));
        }
    }

    private static Map<String, String> getFileHash(File file) throws NoSuchAlgorithmException, IOException {
        Map<String, String> results = new LinkedHashMap<String, String>();

        if (file != null && file.exists()) {
            CRC32 crc32 = new CRC32();
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");

            FileInputStream fis = new FileInputStream(file);
            byte data[] = new byte[size];
            int len = 0;
            while ((len = fis.read(data)) != -1) {
                crc32.update(data, 0, len);
                md5.update(data, 0, len);
                sha1.update(data, 0, len);
                sha256.update(data, 0, len);
            }
            fis.close();

            results.put("CRC-32", toHex(crc32.getValue()));
            results.put(md5.getAlgorithm(), toHex(md5.digest()));
            results.put(sha1.getAlgorithm(), toHex(sha1.digest()));
            results.put(sha256.getAlgorithm(), toHex(sha256.digest()));
        }
        return results;
    }

    private static String toHex(byte[] bytes) {
        String result = "";
        if (bytes != null) {
            StringBuilder sb = new StringBuilder(bytes.length * 2);
            for (byte element : bytes) {
                if ((element & 0xff) < 0x10) {
                    sb.append("0");
                }
                sb.append(Long.toString(element & 0xff, 16));
            }
            result = sb.toString().toUpperCase();
        }
        return result;
    }

    private static String toHex(long value) {
        return Long.toHexString(value).toUpperCase();
    }

}

Answer

Jon Skeet picture Jon Skeet · Mar 15, 2012

Got it. The Windows file system is behaving differently depending on the architecture of your process. This article explains it all - in particular:

But what about 32-bit applications that have the system path hard coded and is running in a 64-bit Windows? How can they find the new SysWOW64 folder without changes in the program code, you might think. The answer is that the emulator redirects calls to System32 folder to the SysWOW64 folder transparently so even if the folder is hard coded to the System32 folder (like C:\Windows\System32), the emulator will make sure that the SysWOW64 folder is used instead. So same source code, that uses the System32 folder, can be compiled to both 32-bit and 64-bit program code without any changes.

Try copying calc.exe to somewhere else... then run the same tools again. You'll get the same results as Java. Something about the Windows file system is giving different data to the tools than it's giving to Java... I'm sure it's something to do with it being in the Windows directory, and thus probably handled "differently".

Furthermore, I've reproduced it in C#... and found out that it depends on the architecture of the process you're running. So here's a sample program:

using System;
using System.IO;
using System.Security.Cryptography;

class Test
{
    static void Main()
    {
        using (var md5 = MD5.Create())
        {
            string path = "c:/Windows/System32/Calc.exe";
            var bytes = md5.ComputeHash(File.ReadAllBytes(path));
            Console.WriteLine(BitConverter.ToString(bytes));
        }
    }
}

And here's a console session (minus chatter from the compiler):

c:\users\jon\Test>csc /platform:x86 Test.cs    

c:\users\jon\Test>test
60-B7-C0-FE-AD-45-F2-06-6E-5B-80-5A-91-F4-F0-FC

c:\users\jon\Test>csc /platform:x64 Test.cs

c:\users\jon\Test>test
10-E4-A1-D2-13-2C-CB-5C-67-59-F0-38-CD-B6-F3-C9