Monday, August 20, 2012

New Feature in Java SE 7 Update 6: Alternative Hash Function


Be prepared for the upcoming Java SE 8! From http://mail.openjdk.java.net/pipermail/jdk7u-dev/2012-July/003721.html we learn the following:

  • Java SE 7 (beginning with Update 6) and Java SE 8 both now support alternative hashing for String keys in hash-based Maps
  • Alternative hashing improves performance when many String key hash codes collide
  • Alternative hashing impacts key, value and element iteration order
  • Alternative hashing is currently DISABLED by default for Java SE 7
  • Future Java SE 7 releases may enable alternative hashing for "large" (>512 capacity) maps
  • Developers can enable the feature in Java SE 7 for testing and deployment with a system property
  • Alternative hashing is ENABLED for all maps in Java SE 8
  • It will probably not be possible to disable alternative hashing in Java SE 8
  • Hash map key, value and element iteration order WILL be different and unpredictable in Java SE 8
  • Different implementation approaches are still being investigated for Java SE 8 and remain subject to change
In the Java SE 7 Update 6 Release Notes you can find the following section:

Alternative Hash Function

Starting from JDK 7u6, an important change is made to hash based Map implementations to improve performance. An alternative hashing function is made available to keys of type String.

Alternative hashing is DISABLED by default: the system property jdk.map.althashing.threshold defaults to -1. To enable the alternative hash function, set the jdk.map.althashing.threshold system property to a different value. The recommended value is 512.
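To check which threshold a JVM was actually started with, the property can be read back like any other system property. The following is only a minimal sketch: the class name AltHashingThreshold and the helper describe are made up for illustration, and the code reports what was passed on the command line, not whether the running JVM honours it.

```java
public class AltHashingThreshold {
    // Property name taken from the release notes quoted above.
    static final String PROPERTY = "jdk.map.althashing.threshold";

    /** Returns the configured threshold, or -1 if the property is unset or not an integer. */
    static int configuredThreshold() {
        return Integer.getInteger(PROPERTY, -1);
    }

    /** Hypothetical helper: turns a threshold value into a human-readable summary. */
    static String describe(int threshold) {
        if (threshold == -1) {
            return "alternative hashing disabled";
        }
        if (threshold == 0) {
            return "alternative hashing for all maps";
        }
        return "alternative hashing for maps with capacity larger than " + threshold;
    }

    public static void main(String[] args) {
        int threshold = configuredThreshold();
        System.out.println(PROPERTY + " = " + threshold + " (" + describe(threshold) + ")");
    }
}
```

Run it once without the property and once with -Djdk.map.althashing.threshold=512 to compare the two reports.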

More details are described in the "Collections Framework Enhancements in Java SE 7" document of the Java SE 7 Update 6 package:

Improved Hash Function
Java SE 7u6 introduces an improved, alternative hash function for the following map and map-derived collection implementations:
  • HashMap
  • Hashtable
  • HashSet
  • LinkedHashMap
  • LinkedHashSet
  • WeakHashMap
  • ConcurrentHashMap

The alternative hash function improves the performance of these map implementations when a large number of key hash collisions are encountered.
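To see what "a large number of key hash collisions" can look like for String keys, here is a small sketch. It relies on the fact that "Aa" and "BB" have the same String.hashCode() value (2112), so any two strings built from the same number of these blocks collide as well. The class and method names are invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class CollidingKeys {
    /**
     * Builds 2^pairs distinct strings that all share one String.hashCode() value,
     * by concatenating the colliding two-character blocks "Aa" and "BB".
     */
    static List<String> collidingKeys(int pairs) {
        List<String> keys = new ArrayList<>();
        keys.add("");
        for (int i = 0; i < pairs; i++) {
            List<String> next = new ArrayList<>(keys.size() * 2);
            for (String prefix : keys) {
                next.add(prefix + "Aa");
                next.add(prefix + "BB");
            }
            keys = next;
        }
        return keys;
    }

    public static void main(String[] args) {
        // Every line prints a different key followed by the same hash code.
        for (String key : collidingKeys(3)) {
            System.out.println(key + " -> " + key.hashCode());
        }
    }
}
```

With the classic String hash function, all such keys land in the same bucket of a hash-based map. This is the situation the alternative hash function is meant to improve.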
For Java SE 7u6, this alternative hash function is implemented as follows:
  • The alternative hash function is only applied to keys of type String.
  • The alternative hash function is only applied to maps with a capacity larger than a specified threshold size. By default, the threshold is -1. This value disables the alternative hash function. To enable the alternative hash function (which is only applied to keys of type String), set the jdk.map.althashing.threshold system property to a different value. The recommended value is 512. Setting this system property to 512 causes all maps with a capacity larger than 512 entries to use the alternative hash function. You can set this system property to 0, which causes all maps to use the alternative hash function. The following describes the jdk.map.althashing.threshold system property in more detail:

    • Value type: Integer
    • Value default: -1
    • Value range: From -1 to 2147483647, inclusive
    • Description: Threshold capacity at which maps use the alternative hash function for keys of type String. The value -1 is a synonym for 2147483647 (-1 is simply easier to remember). All other values correspond to the threshold capacity.
    For example, the following command runs the Java application MyApplication and sets the jdk.map.althashing.threshold system property to 512:

    java -Djdk.map.althashing.threshold=512 MyApplication
    
  • If the alternative hash function is being used, then the iteration order of keys, values, and entries varies for each instance of HashMap, Hashtable, HashSet, and ConcurrentHashMap. This change in iteration order may cause compatibility issues with some programs. This is the reason that the alternative hash function is disabled by default. Future Java SE versions and releases will probably enable the alternative hash function for keys of type String by default. If you are concerned about application performance or compatibility with future versions of Java SE, you should enable the alternative hash function when testing your applications.
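Code that breaks under a changed iteration order usually relies on an order that HashMap never promised. One possible way to make such code robust is sketched below: ask for the order explicitly, either by sorting the keys or by using a map type with a documented order (LinkedHashMap for insertion order, TreeMap for sorted order). The class name StableIteration and the helper sortedKeys are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class StableIteration {
    /** Returns the keys of any map in natural (sorted) order, independent of hashing. */
    static List<String> sortedKeys(Map<String, ?> map) {
        List<String> keys = new ArrayList<>(map.keySet());
        Collections.sort(keys);
        return keys;
    }

    public static void main(String[] args) {
        Map<String, Integer> hashed = new HashMap<>();
        Map<String, Integer> inserted = new LinkedHashMap<>();
        Map<String, Integer> sorted = new TreeMap<>();
        String[] names = {"gamma", "alpha", "beta"};
        for (int i = 0; i < names.length; i++) {
            hashed.put(names[i], i);
            inserted.put(names[i], i);
            sorted.put(names[i], i);
        }
        // HashMap: order is unspecified and may change with the hash function.
        System.out.println("HashMap (unspecified order): " + hashed.keySet());
        // LinkedHashMap: always insertion order.
        System.out.println("LinkedHashMap: " + inserted.keySet());
        // TreeMap: always sorted by key.
        System.out.println("TreeMap: " + sorted.keySet());
        // Sorting the keys of a HashMap also gives a stable order.
        System.out.println("HashMap, sorted keys: " + sortedKeys(hashed));
    }
}
```

Running the program with and without -Djdk.map.althashing.threshold=0 on Java SE 7u6 is a quick way to check whether an application depends on the first, unspecified order.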