hash and indexFor method in HashMap

Java HashMap:

Target AudienceThis post is for the people who already have good understanding of how HashMap works in java and want to understand more about hash and indexFor method.

In this post, we will see how hash and indexFor method works internally in hashmap. hash and indexFor methods belong to HashMap class. Why JDK developers need to have another hash function when  key object’s have there own hashcode method.
Lets see code for hash and indexFor

As java doc says about hash:
Applies a supplemental hash function to a given hashCode, which defends against poor quality hash functions. This is critical because HashMap uses power-of-two length hash tables, that otherwise encounter collisions for hashCodes that do not differ in lower bits. Note: Null keys always map to hash 0, thus index 0.

Lets understand this with the help of example: Lets say key object’s hashcode() returns only 3 values 31,63 and 95. 31,63 and 95 are integers, so all are 32 bits.
31=00000000000000000000000000011111
63=00000000000000000000000000111111
95=00000000000000000000000001011111
Now lets say our hashmap’s lenth is 16(2^4, Hashmap’s length will be always in power of two)

What if we do not use hash function:

Now if we do not use hash function then
indexFor will return :

31=00000000000000000000000000011111 => 1111=15
63=00000000000000000000000000111111  => 1111=15
95=00000000000000000000000001011111 => 1111=15

Why so?
because when we call indexFor function.It will do AND between 31&15 , 63&15 and 95&15.
For ex. 95&15
00000000000000000000000001011111 &00000000000000000000000000001111
so (2^n-1) will always have sequence of 1’s in it.so what matters here it last n bits as 0&1 is always 0.
In above example, all have 1111 in end that is why they are returning same index.
so although we have different hashcode,each Entry object will be stored at 15 index only.

What if we  use hash function:

If we use hash function then :
On each hashcode, first hash will  be applied.

31=00000000000000000000000000011111 => 00000000000000000000000000011110
63=00000000000000000000000000111111  => 00000000000000000000000000111100
95=00000000000000000000000001011111 => 00000000000000000000000001011010

now after passing new hash to indexFor.It will return :
00000000000000000000000000011110 =>1110=14
00000000000000000000000000111100 =>1100=12
00000000000000000000000001011010 =>1010=10

After applying hash function, above hashcodes are returning different index so hash function is redistributing elements in hashmap thus less collisions and better performance.
The main purpose of hash operation is to make the hashcode differences visible in the least significant bits so that the hashmap elements can be distributed evenly across the buckets

Note:

  • If two key objects have same hashcode, then they will always go to same index in table array
  • If two key objects do not have same hashcode then they may or may not go to same index in table array.

Was this post helpful?

Leave a Reply

Your email address will not be published. Required fields are marked *