Separate Chaining: Hashing collision handling technique
August 4, 2017
Check if a Tree is Almost Complete Binary Tree
September 9, 2017

HashMap performance improvement in Java 8

What improvement has java-8 made that performance of HashMap.get() function call is fast by 20% in Java-8.
Java language’s utils package has HasMap class that implements Map interface using a hash table. It implements two constant-time functions put and get that stores and retrieves values in HashMap. HashMap stores information in Key-Value pair as demonstrated in below Code.

import java.util.*;  
public class RTDemo 
{
  public static void main(String args[])
  {  
    HashMap<Integer,String> hash = new HashMap<Integer,String>();
    hash.put(1000, "Ritambhara");
    hash.put(2000, "Moksha");  
    hash.put(3000, "Radha");  
    System.out.println("Value for 1000 is: " + hash.get(1000));
    System.out.println("Value for 2000 is: " + hash.get(2000));
    System.out.println("Value for 3000 is: " + hash.get(3000));
  }
}

The output of above Code is

Value for 1000 is: Ritambhara
Value for 2000 is: Moksha
Value for 3000 is: Radha

HashMap<Integer,String> declares a HashMap with String values stored on Integer keys. Obviously, the implementation cannot keep an array of size 3000 to store these values. The key is passed to a hash function that generates a smaller value indicating index in the actual hash table, something like.

Index = hashFunction(1000);

let us assume that size of hash table is two (Size of hash table is usually higher so that each bucket holds (preferably) single entry) and hashFunction is as below (HashMap in Java uses hashCode() and equals() method of key)

int hashFunction(int n)
{
  return n%2;
}

Index for 1000, 2000 and 3000 are 1, 2 and 1 respectively. Both 1000 and 3000 generate same index. This is called Collision in hashing. There are multiple ways to handle collision, one of them is thru chaining.
Hash table holds a pointer to the head of linked list of records having same hash value.

Calling get function with key 3000, retrieves the value in two steps:

  1. Find the index of hash table that stores key 3000 using same hash function. This value comes out to be 1.
  2. Traverse the list of index 1 in the hash table to find record corresponding to key 3000.

In worst case, all keys correspond to the same index. Searching in such a hash table takes O(n) time. But this is an almost impossible situation for a good hash table implementation. However, there are possibilities that individual linked list has many values.
As you can see each collision has a direct impact on performance. Since searching in a linked list is linear, worst-case time is O(k), where k is number of nodes in the linked list.
What java-8 has done is, when a linked list has more nodes, they replace it with a balanced binary tree dynamically improving the search time from O(k) to O(lg(k)). This simple improvement makes HashMap.get()function call of Java 8, on an average, 20% faster than Java 7.

Leave a Reply

Your email address will not be published. Required fields are marked *