Notes on the hashCode method of java.lang.Object

hashCode: digests the data stored in an instance of a class into a single hash value (a 32-bit signed integer).
Technically, the default hashCode() in java.lang.Object is a native method: it is implemented in native code inside the JVM rather than in Java.

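It is mainly used by data structures that need hash values (hash-based collections) such as Hashtable, HashSet, and HashMap. Because the hash values are spread out across buckets, a lookup only has to compare against the few objects in one bucket, which makes comparison faster.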

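The basic requirements (the hashCode contract):
1) If an object has not changed, it must return the same hashCode value every time;
2) If two objects are equal according to equals(), their hashCode values must be the same;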

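There is no requirement that different JVM implementations return the same hashCode value. Even the same program may return different hashCode values in different runs;
It is acceptable for different objects to return the same hash value (although ideally they return different hashCode values);
Objects that return the same hashCode value are not required to be equal;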

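Why must two objects that are equal according to equals() have the same hashCode value?

A counterexample: suppose two instances are equal but have different hashCode values. If both are added to the same HashSet, they occupy two slots. But because they are equal, we expect them to occupy only one slot in that HashSet, as a single value.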
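
The counterexample can be reproduced with a small sketch. BrokenKey and BrokenContractDemo are hypothetical names made up for illustration: equals() compares only the id, while hashCode() also mixes in a per-instance salt, which deliberately breaks rule 2 of the contract.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical class: equals() looks only at id, but hashCode() also uses salt,
// so two equal objects can report different hash codes (a contract violation).
final class BrokenKey {
    private final int id;
    private final int salt; // ignored by equals(), used by hashCode(): this is the bug

    BrokenKey(int id, int salt) {
        this.id = id;
        this.salt = salt;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof BrokenKey && ((BrokenKey) o).id == id;
    }

    @Override
    public int hashCode() {
        return 31 * id + salt;
    }
}

class BrokenContractDemo {
    // Adds two objects that are equal to each other and returns the resulting set size.
    static int sizeAfterAddingEqualObjects() {
        Set<BrokenKey> set = new HashSet<>();
        set.add(new BrokenKey(1, 0)); // hashCode 31
        set.add(new BrokenKey(1, 1)); // hashCode 32, yet equals() says it is the same value
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterAddingEqualObjects());
    }
}
```

With OpenJDK's HashSet the set ends up holding both objects, because the hash codes are compared before equals() is ever called.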

An example override (odd multipliers give a better spread of hash values):
An object’s hashCode method must take the same fields into account as its equals method.

public class Employee {
    int        employeeId;
    String     name;
    Department dept;
 
    // other methods would be in here 
 
    @Override
    public int hashCode() {
        int hash = 1;
        hash = hash * 17 + employeeId;
        // Guard against a null name, the same way dept is guarded below.
        hash = hash * 31 + (name == null ? 0 : name.hashCode());
        hash = hash * 13 + (dept == null ? 0 : dept.hashCode());
        return hash;
    }
}
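
Since hashCode must use the same fields as equals, it can help to see the two methods side by side. The following is a minimal sketch, not the original author's code: the constructor, the equals method, and the Department stand-in (with its code field) are assumptions added for illustration, and java.util.Objects.hash is swapped in for the hand-written multipliers.

```java
import java.util.Objects;

// Hypothetical stand-in for the Department type, which the original does not show.
final class Department {
    final String code;

    Department(String code) {
        this.code = code;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Department && Objects.equals(code, ((Department) o).code);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(code);
    }
}

final class Employee {
    final int employeeId;
    final String name;
    final Department dept;

    Employee(int employeeId, String name, Department dept) {
        this.employeeId = employeeId;
        this.name = name;
        this.dept = dept;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Employee)) return false;
        Employee other = (Employee) o;
        // Compare exactly the fields that hashCode() combines.
        return employeeId == other.employeeId
                && Objects.equals(name, other.name)
                && Objects.equals(dept, other.dept);
    }

    @Override
    public int hashCode() {
        // Objects.hash is null-safe and combines the same three fields as equals().
        return Objects.hash(employeeId, name, dept);
    }

    public static void main(String[] args) {
        Employee a = new Employee(7, "Ann", new Department("R&D"));
        Employee b = new Employee(7, "Ann", new Department("R&D"));
        System.out.println(a.equals(b) && a.hashCode() == b.hashCode());
    }
}
```

Deriving both methods from one field list keeps them from drifting apart when a field is added later.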

HashCode collisions

Whenever two different objects have the same hash code, we call this a collision. A collision is nothing critical: it just means that there is more than one object in a single bucket, so a HashMap lookup has to look again to find the right object. A lot of collisions will degrade the performance of a system, but they won’t lead to incorrect results.
But if you mistake the hash code for a unique handle to an object, e.g. by using the hash code itself as a key in a Map, then you will sometimes get the wrong object, because even though collisions are rare, they are inevitable. For example, the Strings "Aa" and "BB" produce the same hashCode: 2112. Therefore, never misuse hashCode as a key.
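
The "Aa" / "BB" collision makes the pitfall easy to demonstrate. In this sketch, CollisionDemo and its two helper methods are hypothetical names; one helper misuses the hash code as the map key, the other uses the string itself.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionDemo {
    // Misuse: the hash code stands in for the key, so colliding strings overwrite each other.
    static Map<Integer, String> indexByHashCode(String... values) {
        Map<Integer, String> byHash = new HashMap<>();
        for (String v : values) {
            byHash.put(v.hashCode(), v);
        }
        return byHash;
    }

    // Correct: the object itself is the key; HashMap falls back to equals() on a collision.
    static Map<String, String> indexByValue(String... values) {
        Map<String, String> byValue = new HashMap<>();
        for (String v : values) {
            byValue.put(v, v);
        }
        return byValue;
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode() + " " + "BB".hashCode());
        System.out.println(indexByHashCode("Aa", "BB").size()); // one entry was silently lost
        System.out.println(indexByValue("Aa", "BB").size());    // both entries survive
    }
}
```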

There’s one important detail in the hashCode contract that can be quite surprising: hashCode does not guarantee the same result in different executions.

Moreover, you should be aware that the implementation of a hashCode function may change from one version to another. Therefore your code should not depend on any particular hash code values. For example, you should not use the hash code to persist state. Next time you run the application, the hash codes of the “same” objects may be different.

The hashCode implementation of java.lang.String

public int hashCode() {
    if (hashCode == 0) {
        int hash = 0, end = offset + count;
        for (int i = offset; i < end; i++) {
            hash = (hash << 5) - hash + value[i];
        }
        hashCode = hash;
    }
    return hashCode;
}
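
The expression (hash << 5) - hash in the loop is just hash * 31, so the method computes the polynomial s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1] described in the String.hashCode() Javadoc. A minimal sketch that re-derives it by hand (StringHashDemo and manualHash are hypothetical names, and the cached field and offset/count bookkeeping are left out):

```java
class StringHashDemo {
    // Recomputes the String hash with the same shift-and-subtract step as the loop above.
    static int manualHash(String s) {
        int hash = 0;
        for (int i = 0; i < s.length(); i++) {
            // (hash << 5) - hash == hash * 32 - hash == hash * 31, with int overflow wrapping around
            hash = (hash << 5) - hash + s.charAt(i);
        }
        return hash;
    }

    public static void main(String[] args) {
        // 'A' is 65 and 'a' is 97, so "Aa" hashes to 65 * 31 + 97 = 2112.
        System.out.println(manualHash("Aa"));
        System.out.println(manualHash("hello") == "hello".hashCode());
    }
}
```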

