Some are more equal than others
"And to avoid the tedious repetition of these words: is equal to: I will set as I do often in work use, a pair of parallels, or Gemowe lines of one length, thus: =, because no 2 things, can be more equal." -- Robert Recorde, The Whetstone of Witte (1557)
How does the code work?
You might have seen a similar snippet before, usually followed by the good advice to always check objects for equality using equals() instead of ==. But what really happens inside the code?
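For reference, here is a minimal sketch of the kind of snippet in question. The class name Equality comes from the command shown in this article; the values 100 and 200 and the variable names are assumed from the surrounding discussion, and the helper method sameObject is hypothetical, added only to make the comparison reusable.

```java
class Equality {
    // Hypothetical helper: true only when both references point to the same object.
    static boolean sameObject(Integer x, Integer y) {
        return x == y;
    }

    public static void main(String[] args) {
        Integer a1 = 100, a2 = 100;
        Integer b1 = 200, b2 = 200;
        // == on Integer operands compares references, not the wrapped values.
        if (sameObject(a1, a2)) System.out.println("a1 == a2");
        if (sameObject(b1, b2)) System.out.println("b1 == b2");
    }
}
```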
% java Equality
a1 == a2
It looks like, for some reason, Java doesn't want to acknowledge that 200 is equal to 200, even though it has no problem with 100 being equal to 100. Now what happens when we change the Integers to ints?
int a1 = 100, a2 = 100;
int b1 = 200, b2 = 200;
if (a1 == a2) System.out.println("a1 == a2");
if (b1 == b2) System.out.println("b1 == b2");
Surprisingly, everything works fine this time:
% java Equality2
a1 == a2
b1 == b2
Following the good advice, let's change the code to use equals().
Integer a1 = 100, a2 = 100;
Integer b1 = 200, b2 = 200;
if (a1.equals(a2)) System.out.println("a1 equals a2");
if (b1.equals(b2)) System.out.println("b1 equals b2");
The result is also correct this time:
% java Equality3
a1 equals a2
b1 equals b2
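The reason equals() gives the expected answer is that Integer.equals() compares the wrapped int values rather than the references. As a small sketch, here are a few standard ways to compare by value; the class and method names are hypothetical, and Objects.equals is included only as a null-tolerant variant.

```java
import java.util.Objects;

class ValueEquality {
    // Compares the wrapped values; throws NullPointerException if x is null.
    static boolean viaEquals(Integer x, Integer y) {
        return x.equals(y);
    }

    // Unboxes both sides first, so == compares primitive ints.
    static boolean viaUnboxing(Integer x, Integer y) {
        return x.intValue() == y.intValue();
    }

    // Tolerates null on either side.
    static boolean nullSafe(Integer x, Integer y) {
        return Objects.equals(x, y);
    }

    public static void main(String[] args) {
        Integer b1 = 200, b2 = 200;
        System.out.println(viaEquals(b1, b2));
        System.out.println(viaUnboxing(b1, b2));
        System.out.println(nullSafe(b1, b2));
    }
}
```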
So what happens under the hood that makes 200 not equal to 200? Let's take a closer look at our Integers.
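Integer a1 = 100, a2 = 100;
Integer b1 = 200, b2 = 200;
for (Integer i : new Integer[] {a1, a2, b1, b2})
    System.out.printf("%d -> hash %d, id hash %d%n", i, i.hashCode(), System.identityHashCode(i));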
This piece of code gives us some more insight into the Integers:
% java Equality4
100 -> hash 100, id hash 1265094477
100 -> hash 100, id hash 1265094477
200 -> hash 200, id hash 2125039532
200 -> hash 200, id hash 312714112
The first observation is that the Integers hash to themselves, but that's pretty boring. A more interesting realization is that a1 and a2 point to the same object, while b1 and b2 are distinct.
Why is it so?
The Integer Cache and autoboxing
Introduced in Java 5, the Integer cache's main goals are to improve Integer object performance and to reduce the memory footprint. The idea behind the mechanism is to cache a small number of Integers internally and reuse them.
Autoboxing and autounboxing, the concepts also introduced in Java 5, stand for automatic conversions between the primitive types and the corresponding object wrappers. Let's have a quick look at how these work:
Integer a1 = 100;
With autoboxing, the compiler actually replaces that line of code with:
Integer a1 = Integer.valueOf(100);
Autounboxing works in a similar way:
Integer a1 = 100;
int p = a1;
// actually does this:
int p = a1.intValue();
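To see that the implicit and explicit forms behave the same way, the two conversions can be placed side by side. This is only a sketch: the class and method names are hypothetical, and the comments describe the rewriting as explained above.

```java
class BoxingSketch {
    // Autoboxing: the assignment is compiled as Integer.valueOf(v).
    static Integer boxImplicitly(int v) {
        Integer boxed = v;
        return boxed;
    }

    // The same conversion written out by hand.
    static Integer boxExplicitly(int v) {
        return Integer.valueOf(v);
    }

    // Auto-unboxing: the assignment is compiled as b.intValue().
    static int unboxImplicitly(Integer b) {
        int p = b;
        return p;
    }

    public static void main(String[] args) {
        // Both calls go through Integer.valueOf(100), so they yield the same object.
        System.out.println(boxImplicitly(100) == boxExplicitly(100));
        System.out.println(unboxImplicitly(boxExplicitly(100)));
    }
}
```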
We're just one step away from solving the mystery. Let's have a look at Integer.valueOf() now:
Returns an Integer instance representing the specified int value. If a new Integer instance is not required, this method should generally be used in preference to the constructor Integer(int), as this method is likely to yield significantly better space and time performance by caching frequently requested values. This method will always cache values in the range -128 to 127, inclusive, and may cache other values outside of this range.
Indeed, looking under the hood, we can see that Integer has a private inner class, IntegerCache, that stores Integer instances in an array, by default those with values from -128 to 127. It is used by valueOf(int) to avoid creating new objects unnecessarily. We can also see that the upper bound is configurable with the -XX:AutoBoxCacheMax=n option.
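The mechanism can be illustrated with a simplified model. The sketch below is not the JDK's source: CachedBox and all of its members are hypothetical names, and the bounds are hard-coded to the default range mentioned above. It only shows why two lookups inside the range return the same object while two lookups outside it do not.

```java
// Simplified model of a small-value cache; NOT the JDK's actual implementation.
final class CachedBox {
    private static final int LOW = -128;
    private static final int HIGH = 127; // assumed fixed here; the real upper bound is configurable
    private static final CachedBox[] CACHE = new CachedBox[HIGH - LOW + 1];

    static {
        // Pre-create one instance per value in the cached range.
        for (int i = 0; i < CACHE.length; i++) {
            CACHE[i] = new CachedBox(LOW + i);
        }
    }

    private final int value;

    private CachedBox(int value) {
        this.value = value;
    }

    static CachedBox valueOf(int i) {
        if (i >= LOW && i <= HIGH) {
            return CACHE[i - LOW]; // shared instance
        }
        return new CachedBox(i);   // fresh instance on every call
    }

    int intValue() {
        return value;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof CachedBox && ((CachedBox) o).value == value;
    }

    @Override
    public int hashCode() {
        return value;
    }
}
```

With this model, CachedBox.valueOf(100) == CachedBox.valueOf(100) holds, whereas CachedBox.valueOf(200) == CachedBox.valueOf(200) does not, even though equals() reports both pairs as equal.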
Knowing all this, let's go back to the code. We will print the hash codes of the objects, then use == to check for equality:
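Integer a1 = 100, a2 = 100;
Integer b1 = 200, b2 = 200;
for (Integer i : new Integer[] {a1, a2, b1, b2})
    System.out.printf("%d -> hash %d, id hash %d%n", i, i.hashCode(), System.identityHashCode(i));
if (a1 == a2) System.out.println("a1 == a2");
if (b1 == b2) System.out.println("b1 == b2");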
Time to run the code again, passing the appropriate parameter to the VM:
% java -XX:AutoBoxCacheMax=200 Equality5
100 -> hash 100, id hash 1252169911
100 -> hash 100, id hash 1252169911
200 -> hash 200, id hash 2101973421
200 -> hash 200, id hash 2101973421
a1 == a2
b1 == b2
As expected, because we bumped the upper bound of the integer cache to 200, both b1 and b2 are now served from the cache, so the code produces the expected results.
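The effective upper bound can also be probed at runtime. The following is a rough sketch (CacheProbe and highestCachedValue are hypothetical names): it walks upwards from 0 and stops at the first value for which two calls to Integer.valueOf() return different objects. The limit parameter only keeps the loop bounded in case the cache has been configured to be very large.

```java
class CacheProbe {
    // Returns the largest value in [0, limit] found to be served from the cache,
    // or limit itself if every probed value was cached.
    static int highestCachedValue(int limit) {
        for (int i = 0; i <= limit; i++) {
            // Reference comparison: differs as soon as valueOf() allocates new objects.
            if (Integer.valueOf(i) != Integer.valueOf(i)) {
                return i - 1;
            }
        }
        return limit;
    }

    public static void main(String[] args) {
        System.out.println("highest cached value: " + highestCachedValue(100_000));
    }
}
```

With default settings this should report 127, matching the range described above, and running it with -XX:AutoBoxCacheMax=200 should report 200.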
Fun with IntegerCache: 2 + 2 = 5
By adding reflection to the mix, we can access and modify the integer cache from the code, making Java do unexpected things.
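A minimal sketch of such a modification might look like the following. It assumes the cache is a static array field named cache inside java.lang.Integer$IntegerCache (the field named in the JVM messages shown in this article) and that index 132 holds the value 4, which follows from a cache that starts at -128. The class and method names are hypothetical. Unlike a plain demo, the sketch restores the cache afterwards and returns null when the JVM refuses reflective access.

```java
import java.lang.reflect.Field;

class IntegerCacheSketch {
    // Returns the boxed result of 2 + 2 while the cache is tampered with,
    // or null if the JVM did not permit the reflective access.
    static Integer boxedTwoPlusTwoAfterTampering() {
        try {
            Class<?> cacheClass = Class.forName("java.lang.Integer$IntegerCache");
            Field cacheField = cacheClass.getDeclaredField("cache");
            cacheField.setAccessible(true);            // may be denied, depending on the JDK version
            Integer[] cache = (Integer[]) cacheField.get(null);
            Integer originalFour = cache[132];         // slot for 4 when the cache starts at -128
            cache[132] = cache[133];                   // the slot for 4 now holds the object for 5
            try {
                Integer sum = 2 + 2;                   // boxed through Integer.valueOf(4)
                return sum;
            } finally {
                cache[132] = originalFour;             // put the cache back in order
            }
        } catch (ReflectiveOperationException | RuntimeException e) {
            return null;                               // access was not permitted or the layout differs
        }
    }

    public static void main(String[] args) {
        Integer sum = boxedTwoPlusTwoAfterTampering();
        System.out.println(sum == null ? "reflective access denied" : "2 + 2 = " + sum);
    }
}
```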
After making such a modification to the integer cache, Java will claim that:
% java IntegerCacheFun
2 + 2 = 5
With the introduction of JPMS in Java 9, it's not so much fun anymore -- from this version on, accessing something we're not supposed to gives a warning:
% java IntegerCacheFun
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by IntegerCacheFun (file:/tmp/java-eq/) to field java.lang.Integer$IntegerCache.cache
WARNING: Please consider reporting this to the maintainers of IntegerCacheFun
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
2 + 2 = 5
As you can see, such access is still allowed, to help with the transition. Later releases forbid it by default (JDK 16 onwards), causing an InaccessibleObjectException. On a version that still permits it, you can trigger this behaviour by passing --illegal-access=deny to the JVM, as follows:
% java --illegal-access=deny IntegerCacheFun
Exception in thread "main" java.lang.reflect.InaccessibleObjectException: Unable to make field static final java.lang.Integer[] java.lang.Integer$IntegerCache.cache accessible: module java.base does not "opens java.lang" to unnamed module @355da254
at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:340)
at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:280)
at java.base/java.lang.reflect.Field.checkCanSetAccessible(Field.java:176)
at java.base/java.lang.reflect.Field.setAccessible(Field.java:170)
at IntegerCacheFun.main(IntegerCacheFun.java:7)
An exercise for the reader
Which other Java classes can be abused in such a way to produce wrong results?