Table of Contents
Java HashMap:
- HashMap in java
- How HashMap works in java
- hash and indexfor method in HashMap
- hashcode and equals method in java
- How to sort HashMap by keys and values
- Difference between HashMap and HashSet
- Difference between HashMap and Hashtable
- How to iterate over HashMap
Target AudienceThis post is for the people who already have good understanding of how HashMap works in java and want to understand more about hash and indexFor method.
In this post, we will see how hash and indexFor method works internally in hashmap. hash and indexFor methods belong to HashMap class. Why JDK developers need to have another hash function when key object’s have there own hashcode method.
Lets see code for hash and indexFor
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 |
/** * Applies a supplemental hash function to a given hashCode, which * defends against poor quality hash functions. This is critical * because HashMap uses power-of-two length hash tables, that * otherwise encounter collisions for hashCodes that do not differ * in lower bits. Note: Null keys always map to hash 0, thus index 0. */ static int hash(int h) { // This function ensures that hashCodes that differ only by // constant multiples at each bit position have a bounded // number of collisions (approximately 8 at default load factor). h ^= (h >>> 20) ^ (h >>> 12); return h ^ (h >>> 7) ^ (h >>> 4); } /** * Returns index for hash code h. */ static int indexFor(int h, int length) { return h & (length-1); } |
As java doc says about hash:
Applies a supplemental hash function to a given hashCode, which defends against poor quality hash functions. This is critical because HashMap uses power-of-two length hash tables, that otherwise encounter collisions for hashCodes that do not differ in lower bits. Note: Null keys always map to hash 0, thus index 0.
Lets understand this with the help of example: Lets say key object’s hashcode() returns only 3 values 31,63 and 95. 31,63 and 95 are integers, so all are 32 bits.
31=00000000000000000000000000011111
63=00000000000000000000000000111111
95=00000000000000000000000001011111
Now lets say our hashmap’s lenth is 16(2^4, Hashmap’s length will be always in power of two)
What if we do not use hash function:
31=00000000000000000000000000011111 => 1111=15
63=00000000000000000000000000111111Â => 1111=15
95=00000000000000000000000001011111 => 1111=15
Why so?
because when we call indexFor function.It will do AND between 31&15 , 63&15 and 95&15.
For ex. 95&15
00000000000000000000000001011111 &00000000000000000000000000001111
so (2^n-1) will always have sequence of 1’s in it.so what matters here it last n bits as 0&1 is always 0.
In above example, all have 1111 in end that is why they are returning same index.
so although we have different hashcode,each Entry object will be stored at 15 index only.
What if we use hash function:
31=00000000000000000000000000011111 => 00000000000000000000000000011110
63=00000000000000000000000000111111Â => 00000000000000000000000000111100
95=00000000000000000000000001011111 => 00000000000000000000000001011010
now after passing new hash to indexFor.It will return :
00000000000000000000000000011110 =>1110=14
00000000000000000000000000111100 =>1100=12
00000000000000000000000001011010 =>1010=10
After applying hash function, above hashcodes are returning different index so hash function is redistributing elements in hashmap thus less collisions and better performance.
The main purpose of hash operation is to make the hashcode differences visible in the least significant bits so that the hashmap elements can be distributed evenly across the buckets
Note:
- If two key objects have same hashcode, then they will always go to same index in table array
- If two key objects do not have same hashcode then they may or may not go to same index in table array.