Sunday, August 19, 2007

Difference between the equals and hashCode methods


The Java superclass java.lang.Object defines two very important methods:
  • public boolean equals(Object obj)
  • public int hashCode()
These methods matter whenever user-defined classes interact with other Java classes, for example when their objects are added to collections.

public boolean equals(Object obj)

This method checks whether the object passed as its argument is equal to the object on which the method is invoked. The default implementation in the Object class simply checks whether two object references x and y refer to the same object, i.e. whether x == y. This comparison is also known as a "shallow comparison". Classes that provide their own implementation of the equals method are expected to perform a "deep comparison" by comparing the relevant data members. Since the Object class has no data members that define its state, it can only perform the shallow comparison.
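The difference between the two kinds of comparison can be sketched as follows. The class names Plain and ReferenceDemo are hypothetical and exist only for this illustration; Plain deliberately does not override equals, so it inherits the reference check from Object.

```java
// Hypothetical class that does NOT override equals.
class Plain {
    final int value;

    Plain(int value) {
        this.value = value;
    }
}

class ReferenceDemo {
    public static void main(String[] args) {
        Plain a = new Plain(7);
        Plain b = new Plain(7);
        Plain c = a;

        // Inherited Object.equals compares references only.
        System.out.println(a.equals(b)); // false: two distinct objects
        System.out.println(a.equals(c)); // true: same object

        // String overrides equals and compares the characters.
        String s1 = new String("java");
        String s2 = new String("java");
        System.out.println(s1 == s2);      // false: different references
        System.out.println(s1.equals(s2)); // true: same contents
    }
}
```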


This is what the JDK 1.4 API documentation says about the equals method of the Object class:

Indicates whether some other object is "equal to" this one.
    The equals method implements an equivalence relation:
  • It is reflexive: for any reference value x, x.equals(x) should return true.
  • It is symmetric: for any reference values x and y, x.equals(y) should return true if and only if y.equals(x) returns true.
  • It is transitive: for any reference values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true.
  • It is consistent: for any reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the object is modified.
  • For any non-null reference value x, x.equals(null) should return false.
The equals method for class Object implements the most discriminating possible equivalence relation on objects; that is, for any reference values x and y, this method returns true if and only if x and y refer to the same object (x==y has the value true).

Note that it is generally necessary to override the hashCode method whenever this method is overridden, so as to maintain the general contract for the hashCode method, which states that equal objects must have equal hash codes.
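To see why the symmetry requirement matters, here is a small sketch of an equals method that breaks it. The class Tag is hypothetical; its equals method accepts a plain String, but java.lang.String knows nothing about Tag and cannot return the favour.

```java
// Hypothetical class whose equals method is NOT symmetric.
final class Tag {
    private final String label;

    Tag(String label) {
        this.label = label;
    }

    @Override
    public boolean equals(Object obj) {
        if (obj instanceof Tag) {
            return label.equals(((Tag) obj).label);
        }
        if (obj instanceof String) { // the problematic branch
            return label.equals(obj);
        }
        return false;
    }

    @Override
    public int hashCode() {
        return label.hashCode();
    }
}

class SymmetryDemo {
    public static void main(String[] args) {
        Tag tag = new Tag("java");
        System.out.println(tag.equals("java")); // true
        System.out.println("java".equals(tag)); // false: symmetry is violated
    }
}
```

Removing the String branch restores symmetry, at the cost of Tag no longer comparing equal to raw strings.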

Here are some useful guidelines for implementing the equals method correctly.
  1. Use the equality operator == to check whether the argument is a reference to this object. If it is, return true. This saves time when the actual comparison is costly.
  2. Use the following condition to check that the argument is not null and is of the correct type. If it is not, return false.
    if((obj == null) || (obj.getClass() != this.getClass())) return false;
    Note that the correct type need not be the exact same class, as it is in the condition above. It could be any class or interface that one or more classes agree to implement for the purpose of comparison, in which case an instanceof check is used instead of getClass().
  3. Cast the method argument to the correct type. Again, the correct type may not be the same class. Also, since this step is done after the above type-check condition, it will not result in a ClassCastException.
  4. Compare the significant variables of the argument object and this object. If *all* of them are equal, return true; otherwise return false. Primitive variables can be compared directly with the equality operator (==) after any necessary conversions (for example, converting a float with Float.floatToIntBits or a double with Double.doubleToLongBits). Object references can be compared by invoking their equals method recursively. Make sure that invoking the equals method on these object references cannot result in a NullPointerException.
  5. It is neither necessary nor advisable to include class members that can be calculated from other variables, hence the term "significant variables". Leaving them out also improves the performance of the equals method. Only you can decide which class members are significant and which are not.
  6. Do not change the type of the argument of the equals method. It takes a java.lang.Object as its argument; do not use your own class instead. If you do, you will not be overriding the equals method but overloading it, which causes problems. This is a very common mistake, and since it does not result in a compile time error, it can be quite difficult to figure out why the code is not working properly.
  7. Review your equals method to verify that it fulfills all the requirements stated by the general contract of the equals method.
  8. Lastly, do not forget to override the hashCode method whenever you override the equals method. That's unpardonable. ;)
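The guidelines above can be sketched as follows. Employee, OverloadedId and EqualsGuidelinesDemo are hypothetical names chosen for this illustration, and the @Override annotation assumes Java 5 or later (it does not exist in JDK 1.4). The second class shows the overloading mistake from guideline 6.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical value class, used only to illustrate the guidelines.
final class Employee {
    private final int id;
    private final String name; // may be null
    private final double salary;

    Employee(int id, String name, double salary) {
        this.id = id;
        this.name = name;
        this.salary = salary;
    }

    @Override // lets the compiler reject an accidental overload (guideline 6)
    public boolean equals(Object obj) {
        if (this == obj) return true;                                  // guideline 1
        if (obj == null || obj.getClass() != getClass()) return false; // guideline 2
        Employee other = (Employee) obj;                               // guideline 3
        return id == other.id                                          // guideline 4
            && Double.doubleToLongBits(salary) == Double.doubleToLongBits(other.salary)
            && (name == null ? other.name == null : name.equals(other.name));
    }

    @Override // guideline 8: built from the same fields that equals compares
    public int hashCode() {
        int result = 17;
        result = 31 * result + id;
        result = 31 * result + (name == null ? 0 : name.hashCode());
        long bits = Double.doubleToLongBits(salary);
        result = 31 * result + (int) (bits ^ (bits >>> 32));
        return result;
    }
}

// Hypothetical class showing the overloading mistake.
final class OverloadedId {
    private final int id;

    OverloadedId(int id) {
        this.id = id;
    }

    // Wrong parameter type: this overloads equals instead of overriding it.
    public boolean equals(OverloadedId other) {
        return other != null && id == other.id;
    }
}

class EqualsGuidelinesDemo {
    public static void main(String[] args) {
        Employee a = new Employee(1, "Ann", 1000.0);
        Employee b = new Employee(1, "Ann", 1000.0);
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
        System.out.println(a.equals(null));               // false
        System.out.println(a.equals("Ann"));              // false

        List<Object> ids = new ArrayList<Object>();
        ids.add(new OverloadedId(5));
        // contains calls equals(Object), which OverloadedId never overrode.
        System.out.println(ids.contains(new OverloadedId(5))); // false
    }
}
```

The getClass() check is one design choice; an instanceof check would be the alternative when subclasses or a shared interface should compare as equal.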


public int hashCode()

This method returns the hash code value, an integer, for the object on which it is invoked. It is supported for the benefit of hashing-based collection classes such as Hashtable, HashMap and HashSet. This method must be overridden in every class that overrides the equals method.
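A rough sketch of what goes wrong when this rule is ignored is shown below. BrokenKey and FixedKey are hypothetical classes; the first overrides only equals, the second overrides both methods. The result of the first lookup is not printed with an expected value because identity hash codes are not specified, although in practice it is almost always false.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical key that overrides equals but forgets hashCode.
class BrokenKey {
    final String code;

    BrokenKey(String code) {
        this.code = code;
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof BrokenKey && code.equals(((BrokenKey) obj).code);
    }
    // hashCode is inherited from Object, so equal keys usually hash differently.
}

// Hypothetical key that overrides both methods consistently.
class FixedKey {
    final String code;

    FixedKey(String code) {
        this.code = code;
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof FixedKey && code.equals(((FixedKey) obj).code);
    }

    @Override
    public int hashCode() {
        return code.hashCode();
    }
}

class HashCodeOverrideDemo {
    public static void main(String[] args) {
        Set<Object> set = new HashSet<Object>();
        set.add(new BrokenKey("A1"));
        set.add(new FixedKey("A1"));

        // Equal according to equals, but the lookup typically misses because
        // the inherited hash codes of the two BrokenKey objects differ.
        System.out.println(set.contains(new BrokenKey("A1")));

        // Reliably found: equal objects produce equal hash codes.
        System.out.println(set.contains(new FixedKey("A1"))); // true
    }
}
```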

This is what the JDK 1.4 API documentation says about the hashCode method of the Object class:

Returns a hash code value for the object. This method is supported for the benefit of hashtables such as those provided by java.util.Hashtable.
    The general contract of hashCode is:
  • Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
  • If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
  • It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.
As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java™ programming language.)

In short, the contract makes two points:
  1. Consistency during the same execution - The hash code returned by the hashCode method must stay the same across multiple invocations during one execution of the application, as long as the object is not modified in a way that affects the equals method.
  2. Hash code and equals relationship - The second requirement is the hashCode counterpart of the requirement stated in the equals documentation: equal objects must produce the same hash code. The third point adds that unequal objects need not produce distinct hash codes.
After reviewing the general contracts of these two methods, their relationship can be summed up in the following statement:

Equal objects must produce the same hash code as long as they are equal; however, unequal objects need not produce distinct hash codes.
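The second half of that statement is easy to observe with java.lang.String, whose hash code formula is part of its documented API. In the sketch below (HashCollisionDemo is a hypothetical class name), the strings "Aa" and "BB" are unequal yet share a hash code, and a HashSet still keeps them apart because it falls back on equals.

```java
import java.util.HashSet;
import java.util.Set;

class HashCollisionDemo {
    public static void main(String[] args) {
        String first = "Aa";  // 65 * 31 + 97 = 2112
        String second = "BB"; // 66 * 31 + 66 = 2112

        System.out.println(first.hashCode());     // 2112
        System.out.println(second.hashCode());    // 2112
        System.out.println(first.equals(second)); // false

        // Equal hash codes do not merge the entries; equals decides.
        Set<String> set = new HashSet<String>();
        set.add(first);
        set.add(second);
        System.out.println(set.size()); // 2
    }
}
```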






Tips

  • Equal objects must produce the same hash code as long as they are equal; however, unequal objects need not produce distinct hash codes.
  • The equals method provides "deep comparison" by checking if two objects are logically equal as opposed to the "shallow comparison" provided by the equality operator ==.
  • However, the equals method in java.lang.Object class only provides "shallow comparison", same as provided by the equality operator ==.
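  • The equals method takes only object references as its argument, not primitives. In JDK 1.4 and earlier, passing a primitive results in a compile time error; from Java 5 onward, the primitive is autoboxed into its wrapper object instead.
  • Passing objects of different types to the equals method never results in a compile time error, and a correctly implemented equals method does not throw a runtime error either; it simply returns false.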
  • For standard Java wrapper classes and for java.lang.String, if the equals argument type (class) is different from the type of the object on which the equals method is invoked, it will return false.
  • The class java.lang.StringBuffer does not override the equals method, and hence it inherits the implementation from java.lang.Object class.
  • A user-defined equals method must not report equality with objects of a built-in Java class. The built-in class will not return true in the other direction, so doing this violates the symmetry requirement stated in the general contract of the equals method.
  • If null is passed as an argument to a correctly implemented equals method, it will return false.
  • Equal hash codes do not imply that the objects are equal.
  • return 1; is a legal implementation of the hashCode method, but a very bad one. It is legal because it ensures that equal objects have equal hash codes and that the hash code is consistent across multiple invocations during the same execution, so it does not violate the general contract of the hashCode method. It is bad because it returns the same hash code for all objects, so every object lands in the same bucket and lookups in hashing-based collections degrade towards a linear search. This explanation applies to every hashCode implementation that returns the same constant integer for all objects.
  • In standard JDK 1.4, the wrapper classes java.lang.Short, java.lang.Byte, java.lang.Character and java.lang.Integer simply return the value they represent as the hash code by typecasting it to an int.
  • Since JDK version 1.3, the class java.lang.String caches its hash code, i.e. it calculates the hash code only once and stores it in an instance variable and returns this value whenever the hashCode method is called. It is legal because java.lang.String represents an immutable string.
  • It is incorrect to involve a random number directly while computing the hash code of an object, as the method would then not consistently return the same hash code for multiple invocations during the same execution.
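Several of these tips can be checked with a short program. This is only a sketch: ConstantHash and TipsDemo are hypothetical names, and the wrapper hash code values shown are what current JDKs return, matching the JDK 1.4 behaviour described above.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical class with a legal but poor constant hashCode.
final class ConstantHash {
    private final int value;

    ConstantHash(int value) {
        this.value = value;
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof ConstantHash && value == ((ConstantHash) obj).value;
    }

    @Override
    public int hashCode() {
        return 1; // legal: equal objects share it, but so does every other object
    }
}

class TipsDemo {
    public static void main(String[] args) {
        // StringBuffer inherits Object.equals, so equal contents are not "equal".
        System.out.println(new StringBuffer("x").equals(new StringBuffer("x"))); // false

        // Wrapper classes and String return false for an argument of another type.
        System.out.println(Integer.valueOf(1).equals(Long.valueOf(1L))); // false
        System.out.println("1".equals(Integer.valueOf(1)));              // false

        // These wrappers use the wrapped value as the hash code.
        System.out.println(Integer.valueOf(-5).hashCode());       // -5
        System.out.println(Short.valueOf((short) 42).hashCode()); // 42
        System.out.println(Character.valueOf('A').hashCode());    // 65

        // A constant hash code still gives correct, if slow, collection behaviour.
        Set<ConstantHash> set = new HashSet<ConstantHash>();
        set.add(new ConstantHash(1));
        set.add(new ConstantHash(1));
        set.add(new ConstantHash(2));
        System.out.println(set.size());                        // 2
        System.out.println(set.contains(new ConstantHash(2))); // true
    }
}
```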
