Unit 28: Immutability
After this unit, students should:
- be able to create an immutable class
So far in this course, we have focused on three ways of dealing with software complexity: encapsulating and hiding the complexity behind abstraction barriers, using a language with a strong type system and adhering to the subtyping substitution principle, and applying the abstraction principle by reusing code written as functions, classes, and generic types.
Another useful strategy to reduce bugs when code complexity increases is to avoid change altogether. This can be done by making our classes immutable. Once we create an instance of an immutable class, the instance cannot have any visible changes outside its abstraction barrier. This means that every call of the instance's methods must behave the same way throughout the lifetime of the instance.
There are many reasons to make our classes immutable when possible. To start, let's revisit a common bug due to aliasing. Recall the example from Unit 9, where we create two circles c1 and c2 centered at the origin (0, 0).
Let's say that we have the moveTo method in both Circle and Point, to move the circle and the point respectively.
Suppose we want to move c1, and only c1, to be centered at (1, 1), by calling c1.moveTo(1, 1).
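A minimal sketch of this scenario, assuming a Circle that stores the Point it is given and a moveTo in each class that mutates in place. The field names, the radii, and the AliasingDemo class are illustrative assumptions, not necessarily the course's exact listing.

```java
// Mutable Point: moveTo changes the fields in place.
class Point {
  private double x;
  private double y;

  public Point(double x, double y) {
    this.x = x;
    this.y = y;
  }

  public void moveTo(double x, double y) {
    this.x = x;
    this.y = y;
  }

  @Override
  public String toString() {
    return "(" + this.x + ", " + this.y + ")";
  }
}

// Mutable Circle: keeps the caller's Point reference without copying it.
class Circle {
  private Point c;
  private double r;

  public Circle(Point c, double r) {
    this.c = c;
    this.r = r;
  }

  // Moves the circle by mutating its center point in place.
  public void moveTo(double x, double y) {
    this.c.moveTo(x, y);
  }

  @Override
  public String toString() {
    return "{ center: " + this.c + ", radius: " + this.r + " }";
  }
}

class AliasingDemo {
  public static void main(String[] args) {
    Point p = new Point(0, 0);
    Circle c1 = new Circle(p, 1);
    Circle c2 = new Circle(p, 4);  // c1 and c2 share the same Point p

    c1.moveTo(1, 1);  // intended to move only c1

    System.out.println(c1);  // { center: (1.0, 1.0), radius: 1.0 }
    System.out.println(c2);  // { center: (1.0, 1.0), radius: 4.0 }
  }
}
```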
The call c1.moveTo(1, 1) surprisingly moves the center of both c1 and c2, because both circles share the same point. One solution we have explored is to give each circle its own copy of the point.
This approach avoids sharing references by creating copies of our points so that no two references point to the same instance, avoiding aliasing altogether. This partial fix, however, comes with extra costs in computational resources as the number of objects may proliferate.
This is also not a complete solution because, surprisingly, we can still move c2 without calling c2.moveTo(1, 1), by calling moveTo directly on the point that was passed to c2's constructor.
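The copying approach and its remaining loophole could be sketched as follows, using mutable Point and Circle classes as described in the text. The variable names p1 and p2 and the CopyDemo class are assumptions made for illustration.

```java
// Mutable Point and Circle, as described in the text.
class Point {
  private double x;
  private double y;

  public Point(double x, double y) {
    this.x = x;
    this.y = y;
  }

  public void moveTo(double x, double y) {
    this.x = x;
    this.y = y;
  }

  @Override
  public String toString() {
    return "(" + this.x + ", " + this.y + ")";
  }
}

class Circle {
  private Point c;
  private double r;

  public Circle(Point c, double r) {
    this.c = c;
    this.r = r;
  }

  public void moveTo(double x, double y) {
    this.c.moveTo(x, y);
  }

  @Override
  public String toString() {
    return "{ center: " + this.c + ", radius: " + this.r + " }";
  }
}

class CopyDemo {
  public static void main(String[] args) {
    Point p1 = new Point(0, 0);
    Circle c1 = new Circle(p1, 1);
    Point p2 = new Point(0, 0);  // a separate copy of the origin for c2
    Circle c2 = new Circle(p2, 4);

    c1.moveTo(1, 1);  // no longer affects c2
    System.out.println(c2);  // { center: (0.0, 0.0), radius: 4.0 }

    p2.moveTo(1, 1);  // c2 is never mentioned, yet its center moves
    System.out.println(c2);  // { center: (1.0, 1.0), radius: 4.0 }
  }
}
```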
Let's now see how immutability can help us resolve our problem.
Immutable Points and Circles
Let's start by making our Point class immutable. We begin by making the fields final to signal that we do not intend to assign another value to them. Now that x and y cannot be re-assigned (whether to a new value or even to the same value), moving a point can no longer be done by re-assigning the fields x and y. Instead, we return a new Point instance to avoid mutating the current instance, as follows:
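A sketch of what such an immutable Point might look like. The toString method and the PointDemo class are added for illustration; the class is also declared final, for the reason discussed next.

```java
final class Point {
  private final double x;
  private final double y;

  public Point(double x, double y) {
    this.x = x;
    this.y = y;
  }

  // Returns a new Point instead of mutating this one.
  public Point moveTo(double x, double y) {
    return new Point(x, y);
  }

  @Override
  public String toString() {
    return "(" + this.x + ", " + this.y + ")";
  }
}

class PointDemo {
  public static void main(String[] args) {
    Point p = new Point(0, 0);
    Point q = p.moveTo(1, 1);

    System.out.println(p);  // (0.0, 0.0)
    System.out.println(q);  // (1.0, 1.0)
  }
}
```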
Note that, to prevent (likely malicious or ignorant) subclasses of Point from overriding the methods to make it appear that the point has mutated, it is recommended that we declare immutable classes as final to disallow inheritance.
Now, let's make Circle immutable:
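One possible shape for the immutable Circle, shown together with a compact immutable Point so that the example is self-contained. The ImmutableCircleDemo class and the radii are illustrative assumptions.

```java
final class Point {
  private final double x;
  private final double y;

  public Point(double x, double y) {
    this.x = x;
    this.y = y;
  }

  public Point moveTo(double x, double y) {
    return new Point(x, y);
  }

  @Override
  public String toString() {
    return "(" + this.x + ", " + this.y + ")";
  }
}

final class Circle {
  private final Point c;
  private final double r;

  public Circle(Point c, double r) {
    this.c = c;  // sharing is safe because Point is immutable
    this.r = r;
  }

  // Returns a new Circle with a new center; this circle is untouched.
  public Circle moveTo(double x, double y) {
    return new Circle(this.c.moveTo(x, y), this.r);
  }

  @Override
  public String toString() {
    return "{ center: " + this.c + ", radius: " + this.r + " }";
  }
}

class ImmutableCircleDemo {
  public static void main(String[] args) {
    Point p = new Point(0, 0);
    Circle c1 = new Circle(p, 1);
    Circle c2 = new Circle(p, 4);

    c1.moveTo(1, 1);  // result discarded, so nothing changes
    System.out.println(c1);  // { center: (0.0, 0.0), radius: 1.0 }

    c1 = c1.moveTo(1, 1);  // explicit re-assignment
    System.out.println(c1);  // { center: (1.0, 1.0), radius: 1.0 }
    System.out.println(c2);  // { center: (0.0, 0.0), radius: 4.0 }
  }
}
```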
With both Point and Circle immutable, we can be sure that once an instance is created, it remains unchanged (outside the abstraction barrier). In particular, calling c1.moveTo(1, 1) returns a new Circle and leaves both c1 and c2 as they were.
To update the variable c1, we need to explicitly reassign it, as in c1 = c1.moveTo(1, 1).
Now, c1 moves to a new location, but c2 remains unchanged.
Compare our new immutable approach to the two approaches above. The first shares all the references and is bug-prone. The second creates a new copy of the instance every time and is resource-intensive. Our third approach, using immutable classes, allows us to share all the references until we need to modify the instance, in which case we make a copy. Such copy-on-write semantics allow us to avoid aliasing bugs without creating excessive copies of objects.
Note that the final keyword prevents assigning a new value to the field. Unfortunately, it does not prevent the object that the field refers to from being mutated. So, to ensure that the classes we create are immutable, we have to ensure that the types of the fields are themselves immutable.
Advantages of Being Immutable
We have seen how making our classes immutable helps us remove the risk of potential bugs when we use composition and aliasing. Immutability has other advantages as well.
Ease of Understanding
Code written with immutable objects is easier to reason about and easier to understand. Suppose we create a Circle centered at (0, 0) with a radius of 8 and assign it to a local variable c.
We pass c around to many other methods. These other methods may invoke c's methods; we may invoke c's methods locally as well. But, despite putting c through so much, unless we have explicitly re-assigned c, we can guarantee that c is still a circle centered at (0, 0) with a radius of 8. This immutability makes it significantly easier to read, understand, and debug our code.
Without this property, we have to trace through all the methods that we pass c to, and each call of c's methods, to make sure that none of this code modifies c.
Enabling Safe Sharing of Objects
Making a class immutable allows us to safely share instances of the class and therefore reduces the need to create multiple copies of the same object. For instance, the origin (0, 0) is commonly used. If the instance is immutable, we can just create and cache a single copy of the origin, and always return this copy when the origin is required.
Let's modify our Point class so that it creates a single copy of the origin and returns the same copy every time the origin is required.
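A sketch of such a Point, with a cached ORIGIN instance and a factory method of. The exact comparison used to detect the origin, the toString method, and the OriginDemo class are assumptions of this sketch.

```java
final class Point {
  private final double x;
  private final double y;

  // The single shared instance representing (0, 0).
  private static final Point ORIGIN = new Point(0, 0);

  // Private, so clients must go through Point.of.
  private Point(double x, double y) {
    this.x = x;
    this.y = y;
  }

  // Factory method: returns the cached origin when asked for (0, 0).
  public static Point of(double x, double y) {
    if (x == 0 && y == 0) {
      return ORIGIN;
    }
    return new Point(x, y);
  }

  public Point moveTo(double x, double y) {
    return Point.of(x, y);
  }

  @Override
  public String toString() {
    return "(" + this.x + ", " + this.y + ")";
  }
}

class OriginDemo {
  public static void main(String[] args) {
    System.out.println(Point.of(0, 0) == Point.of(0, 0));  // true
    System.out.println(Point.of(1, 1) == Point.of(1, 1));  // false

    // Moving "the origin" yields a different Point; the cache is intact.
    System.out.println(Point.of(0, 0).moveTo(1, 1));  // (1.0, 1.0)
    System.out.println(Point.of(0, 0));               // (0.0, 0.0)
  }
}
```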
We made a few changes in the above:
- We made the constructor for Point private so that one cannot call the constructor directly.
- We provide a class factory method named of for the client to create a Point instance. The of method returns the same instance ORIGIN every time Point.of(0, 0) is called.
Such a design pattern is only safe when the class is immutable. Consider the mutable version of Point: calling Point.of(0, 0).moveTo(1, 1) would change every reference to the origin to (1, 1), causing chaos in the code!
Enabling Safe Sharing of Internals
Immutable instances can also share their internals freely. Consider an immutable implementation of our Array<T>, called ImmutableArray<T>. Let's start with a simple version first.
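A minimal sketch of such a simple version, with a factory method of and a get method but no set. The defensive copy inside of and the ImmutableArrayDemo class are assumptions of this sketch, not necessarily what the course's listing does.

```java
final class ImmutableArray<T> {
  private final T[] array;

  // Only items of type T go into the array, so the varargs use is safe.
  @SafeVarargs
  public static <T> ImmutableArray<T> of(T... items) {
    // Copy defensively (an assumption of this sketch) so that a caller
    // who passes an existing array cannot change our contents later.
    return new ImmutableArray<>(items.clone());
  }

  private ImmutableArray(T[] a) {
    this.array = a;
  }

  public T get(int index) {
    return this.array[index];
  }
}

class ImmutableArrayDemo {
  public static void main(String[] args) {
    ImmutableArray<Integer> a = ImmutableArray.of(1, 2, 3);
    System.out.println(a.get(0));  // 1
    System.out.println(a.get(1));  // 2
  }
}
```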
There are a few things to note here.
Varargs. The parameter to the class factory method of has the form T... items. The triple-dot notation (...) is Java syntax for a variable number of arguments of the same type (T). Often called varargs, this is just syntactic sugar for passing in an array of items to a method. Such a method is called a variadic method. We can then call of with a variable number of arguments, such as ImmutableArray.of(1, 2, 3).
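To see that varargs is only syntactic sugar for an array parameter, here is a small sketch using a hypothetical helper method count (not part of ImmutableArray). A call with separate arguments and a call with an explicit array reach the method in the same form.

```java
class VarargsDemo {
  // Hypothetical variadic helper: inside the method, items is an
  // ordinary String[].
  static int count(String... items) {
    return items.length;
  }

  public static void main(String[] args) {
    System.out.println(count());               // 0
    System.out.println(count("a", "b", "c"));  // 3

    // Equivalent call that passes the array explicitly.
    System.out.println(count(new String[] {"a", "b", "c"}));  // 3
  }
}
```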
@SafeVarargs. Since the varargs is just an array, and arrays and generics do not mix well in Java, the compiler would give us an unchecked warning. In this instance, however, we know that our code is safe because we never put anything other than items of type T into the array. We can use the @SafeVarargs annotation to tell the compiler that we know what we are doing and that this varargs is safe.
Notice that we removed the set method, and there is no other way for an external client to modify the array once it is created. This, of course, assumes that we will only be inserting immutable objects into our immutable array. Unfortunately, this cannot be enforced by the compiler as the generic type T can be anything.
Now, suppose that we wish to support a subarray method that returns a new array containing only a range of elements in the original array. It behaves as follows:
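A sketch of the intended behavior, assuming that both the start and end indices are inclusive and, for now, using a straightforward copying implementation. The element values and the SubarrayDemo class are illustrative assumptions.

```java
final class ImmutableArray<T> {
  private final T[] array;

  @SafeVarargs
  public static <T> ImmutableArray<T> of(T... items) {
    return new ImmutableArray<>(items.clone());
  }

  private ImmutableArray(T[] a) {
    this.array = a;
  }

  public T get(int index) {
    return this.array[index];
  }

  // Copying implementation: allocates a new array holding elements
  // start..end (both inclusive, an assumption of this sketch).
  public ImmutableArray<T> subarray(int start, int end) {
    return new ImmutableArray<>(
        java.util.Arrays.copyOfRange(this.array, start, end + 1));
  }
}

class SubarrayDemo {
  public static void main(String[] args) {
    ImmutableArray<Integer> a = ImmutableArray.of(10, 20, 30, 40, 50, 60);

    ImmutableArray<Integer> b = a.subarray(2, 4);  // b holds 30, 40, 50
    System.out.println(b.get(0));  // 30

    ImmutableArray<Integer> c = b.subarray(1, 2);  // c holds 40, 50
    System.out.println(c.get(1));  // 50
  }
}
```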
A typical way to implement subarray is to allocate a new T[] and copy the elements over. This operation can be expensive if our ImmutableArray has millions of elements. But, since our class is immutable and the internal field array is guaranteed not to mutate, we can safely let b and c refer to the same array from a, and store only the starting and ending indices.
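A sketch of this sharing approach, assuming inclusive start and end fields and indices that are relative to each view. The bounds checks, the choice of exception, and the SharedSubarrayDemo class are assumptions of this sketch.

```java
final class ImmutableArray<T> {
  private final T[] array;  // shared among all views; never modified
  private final int start;  // first valid position in array (inclusive)
  private final int end;    // last valid position in array (inclusive)

  @SafeVarargs
  public static <T> ImmutableArray<T> of(T... items) {
    return new ImmutableArray<>(items.clone(), 0, items.length - 1);
  }

  private ImmutableArray(T[] a, int start, int end) {
    this.array = a;
    this.start = start;
    this.end = end;
  }

  // index is relative to this view, not to the underlying array.
  public T get(int index) {
    if (index < 0 || this.start + index > this.end) {
      throw new IndexOutOfBoundsException("Index out of bounds: " + index);
    }
    return this.array[this.start + index];
  }

  // No elements are copied: the new view shares the same array and
  // only records a narrower range.
  public ImmutableArray<T> subarray(int start, int end) {
    if (start < 0 || start > end || this.start + end > this.end) {
      throw new IndexOutOfBoundsException("Invalid range");
    }
    return new ImmutableArray<>(this.array, this.start + start,
        this.start + end);
  }
}

class SharedSubarrayDemo {
  public static void main(String[] args) {
    ImmutableArray<Integer> a = ImmutableArray.of(10, 20, 30, 40, 50, 60);
    ImmutableArray<Integer> b = a.subarray(2, 4);  // view of 30, 40, 50
    ImmutableArray<Integer> c = b.subarray(1, 2);  // view of 40, 50

    System.out.println(b.get(0));  // 30
    System.out.println(c.get(1));  // 50
  }
}
```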
Enabling Safe Concurrent Execution
We will explore concurrent execution of code towards the end of the module, but making our classes immutable goes a long way in reducing bugs related to concurrent execution. Without going into details here (you will learn the details later), concurrent programming allows multiple threads of code to run in an interleaved fashion, in an arbitrary interleaving order. If we have complex code that is difficult to debug to begin with, imagine having code where we have to ensure its correctness regardless of how the execution interleaves! Immutability helps us ensure that regardless of how the code interleaves, our objects remain unchanged.
Final ≠ Immutable
When creating an immutable class, we need to be careful to distinguish between the keywords that help us avoid accidentally making things mutable and the actual concept of an immutable class. For instance, it is insufficient to simply declare all fields with the final keyword. Just because we cannot accidentally re-assign a field does not mean that the object it refers to is immutable. Consider the same Circle above but with a getter for the center point, and now imagine that the Point is mutable.
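A sketch of this situation, assuming a getter named getCenter and a mutable Point whose moveTo changes its fields in place. The LeakDemo class is illustrative.

```java
// Mutable Point: moveTo changes the fields in place.
class Point {
  private double x;
  private double y;

  public Point(double x, double y) {
    this.x = x;
    this.y = y;
  }

  public void moveTo(double x, double y) {
    this.x = x;
    this.y = y;
  }

  @Override
  public String toString() {
    return "(" + this.x + ", " + this.y + ")";
  }
}

// All fields are final, yet the class is not immutable.
final class Circle {
  private final Point c;
  private final double r;

  public Circle(Point c, double r) {
    this.c = c;
    this.r = r;
  }

  // Hands out the internal, mutable Point.
  public Point getCenter() {
    return this.c;
  }

  @Override
  public String toString() {
    return "{ center: " + this.c + ", radius: " + this.r + " }";
  }
}

class LeakDemo {
  public static void main(String[] args) {
    Circle c = new Circle(new Point(0, 0), 1);
    c.getCenter().moveTo(1, 1);  // mutates c's center from outside
    System.out.println(c);  // { center: (1.0, 1.0), radius: 1.0 }
  }
}
```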
We can then simply retrieve the center point and mutate it externally, for example with c.getCenter().moveTo(1, 1).
On the other hand, it is not even necessary to use the final keyword on the fields to make an immutable class. We simply have to have a class that prevents any and all kinds of sharing, by copying all the parameters before assigning them to the fields and copying all return values. Assume that all classes have a correctly implemented clone() method. Then the following Circle is immutable even with a getter and no final keyword on the fields. We still need the final keyword on the class to disallow inheritance.
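A sketch of such a Circle, assuming a mutable Point whose clone() simply constructs an independent copy. That clone() implementation, the moveTo on Circle, and the CloneDemo class are assumptions made for illustration.

```java
// Mutable Point with a clone() that returns an independent copy.
class Point {
  private double x;
  private double y;

  public Point(double x, double y) {
    this.x = x;
    this.y = y;
  }

  public void moveTo(double x, double y) {
    this.x = x;
    this.y = y;
  }

  @Override
  public Point clone() {
    return new Point(this.x, this.y);
  }

  @Override
  public String toString() {
    return "(" + this.x + ", " + this.y + ")";
  }
}

// No final fields, yet no outside code can ever reach this.c.
final class Circle {
  private Point c;
  private double r;

  public Circle(Point c, double r) {
    this.c = c.clone();  // copy on the way in
    this.r = r;
  }

  public Point getCenter() {
    return this.c.clone();  // copy on the way out
  }

  public Circle moveTo(double x, double y) {
    Point p = this.c.clone();
    p.moveTo(x, y);  // mutates only the private copy
    return new Circle(p, this.r);
  }

  @Override
  public String toString() {
    return "{ center: " + this.c + ", radius: " + this.r + " }";
  }
}

class CloneDemo {
  public static void main(String[] args) {
    Point p = new Point(0, 0);
    Circle c = new Circle(p, 1);

    p.moveTo(5, 5);              // does not affect c
    c.getCenter().moveTo(1, 1);  // mutates only a copy
    System.out.println(c);  // { center: (0.0, 0.0), radius: 1.0 }

    System.out.println(c.moveTo(2, 2));  // { center: (2.0, 2.0), radius: 1.0 }
    System.out.println(c);  // { center: (0.0, 0.0), radius: 1.0 }
  }
}
```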
That is not to say that the final keyword is not important. It helps prevent accidental re-assignment, and in some cases that is sufficient, especially if the fields are of primitive types. Once we have created one immutable class, we can then create other, larger immutable classes by using only immutable classes as fields.