michael.gr: Immutability Assessment

In languages like Java and C#

Abstract

The need is identified for programmatically ascertaining the immutability of certain objects used in situations where they are expected to be immutable. The technicalities of immutability assessment are discussed. A mechanism is described for achieving it.

(Useful pre-reading: About these papers)

The Problem

Raise your hand if you have ever had to troubleshoot a bug that manifested itself in mysterious ways, defied rational explanation, tenaciously evaded detection, made you rage at the absurdity of the observed behavior, and after much weeping and wailing and gnashing of teeth, turned out to be due to one of the following reasons:

1. Inadvertently mutating an object that has been added as a key in a hash map.

2. Inadvertently mutating an object that has been passed to another thread.

Or, more generally:

3. One piece of code mutating an object that another piece of code groundlessly assumes will remain unchanged.

These mishaps of course happen due to the fact that the objects involved should have been immutable, but they were not. If an object is immutable, nobody can mutate it, and therefore nobody has to assume that it will not change.
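To make the first mishap concrete, here is a minimal Java sketch. The `MutablePoint` and `HashKeyDemo` classes are hypothetical, invented only for this illustration; the behavior shown is that of the standard `java.util.HashMap`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical mutable key type, used only for this illustration.
final class MutablePoint {
    int x;
    int y;

    MutablePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof MutablePoint
                && ((MutablePoint) other).x == x
                && ((MutablePoint) other).y == y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}

final class HashKeyDemo {
    // Returns true if the entry can still be found after its key has been mutated.
    static boolean lookupSurvivesMutation() {
        Map<MutablePoint, String> map = new HashMap<>();
        MutablePoint key = new MutablePoint(1, 2);
        map.put(key, "value");
        key.x = 100; // the hash code changes while the key sits inside the map
        return map.containsKey(key);
    }

    public static void main(String[] args) {
        System.out.println(lookupSurvivesMutation());
    }
}
```

The lookup fails even though the map still holds the entry and we are asking with the very same key object, because the map recorded the key's old hash code at insertion time.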

So, could hash maps somehow require that their keys be immutable? Could threads somehow require that objects shared among them be immutable?

This leads us to the more general question of how to ascertain immutability, which is certainly not an easy task. Most programmers don't even consider it; few talk about it; even fewer attempt it. Programmers all over the world are accustomed to routinely using objects in situations where immutability is an absolute requirement, but without ever ascertaining it, essentially praying that the objects be immutable.

Compiler-Enforced Immutability

Inadvertent mutation is not a problem in purely functional programming languages, where there simply is no such thing as mutation. However, most programmers do not use such languages, because they are cumbersome to work with. Most programmers use languages like Java and C#, which are not purely functional, so they allow mutation, and so inadvertent mutation can sometimes happen.

Java and C# do support a few constructs for defining invariable (final/readonly) class members, but they are woefully inadequate. Systematic compiler support for declaring and requiring immutability would greatly help to reduce the volume of mistakes being made, but nothing like that exists, and even if it did exist, it would not be a panacea, because there are situations where the compiler cannot help.

Since compiler-enforced immutability is not available, we have to enforce it ourselves, which means that we have to programmatically detect immutability and ascertain it.

Languages like Java and C# offer full reflection support, so we can examine every field of every type, (static analysis,) and we can even examine the values of fields of instances. (Dynamic analysis.) Furthermore, these languages compile into intermediate code, which is relatively easy to parse and reason about, meaning that we can even analyze executable code if we want to. (More static analysis.)

So, the question is what to analyze, and how.

Superficial vs. Deep Immutability

Many classes have the term "immutable" in their name, but they are only superficially immutable. Take a generic immutable collection for example: `ImmutableCollection<T>`. Let us trust that it does in fact behave perfectly immutably, and therefore it does, arguably, deserve to be called immutable; let us now ask: would an instance of this class be safe to pass to another thread? The answer is that it depends on the actual type of the generic parameter: If `T` is immutable, it is safe; but if `T` is mutable, then it is absolutely not safe.

So, in order to reap any benefits whatsoever from immutability, it must be deep immutability. Shallow immutability is irrelevant. Please keep this in mind, as it has severe implications in our quest to ascertain the immutability of anything.
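A small Java sketch of the difference, using the unmodifiable list returned by `List.of` (Java 9 or later) as a stand-in for `ImmutableCollection<T>`. The class name `ShallowImmutabilityDemo` is made up for this example.

```java
import java.util.List;

final class ShallowImmutabilityDemo {
    // Builds an unmodifiable list and then mutates its only element.
    static String demo() {
        StringBuilder element = new StringBuilder("before");
        List<StringBuilder> list = List.of(element); // the list itself cannot be modified
        element.append("-after");                    // but the element it refers to still can
        return list.get(0).toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The list never changes which object it refers to, yet what an observer reads out of it does change, which is exactly why shallow immutability is of no help to another thread.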

Static Analysis

The term "static analysis" refers to examining the code that makes a program, (as written, or as compiled,) but not the state of the program as it runs. Consequently, static analysis can examine the definitions of data structures, but not the actual contents of those data structures during runtime.

A popular but naive understanding of immutability is that it is an inherent characteristic of types, and that the instances of the types (i.e. the objects) simply follow suit. According to this understanding, all we need to do is to ascertain that a certain type is immutable, and from that moment on we know that all of its instances are immutable.

This understanding is not entirely false, but it is very limiting, because it means that only concrete and non-extensible (a.k.a. final, sealed) types can potentially be assessed as immutable: All interfaces must necessarily be considered as mutable, because we have no idea how they may be implemented, and all abstract or simply extensible types must also necessarily be considered as mutable, because we have no idea how they may be extended.

This poses an insurmountable problem if we wanted to have, say, a queue for exchanging messages between threads, where the messages are organized in a class hierarchy: such a queue would not be able to ascertain the immutability of the messages it handles, because all it knows is the base-most 'Message' class, which is necessarily extensible, and therefore mutable, as far as static analysis can tell.

Now, consider that many perfectly immutable classes tend to be passed around as interfaces, (e.g. `Comparer`, `Hasher`, `Predicate`, all sorts of stateless converters, etc.) that these interfaces are often stored in fields, and that a field of mutable type makes the class containing that field also mutable. It quickly becomes evident that static analysis can only work in a universe where no abstraction is utilized; however, we do not live in such a universe: we make use of languages like Java and C# precisely because we want the benefits of unlimited abstraction.

One final nail in the coffin of static analysis is the issue of delayed immutability.

Delayed Immutability

Some objects begin life as mutable, so that they can undergo some non-trivial initialization, and become immutable later, once initialization is complete. This behavior is necessary when creating cyclic graphs of immutable objects, or when creating an immutable object while loading its contents from some external storage. (Alternative terms used by others for this kind of immutability are Freezing and Popsicle immutability.)

There is no standard way of representing delayed immutability, so let me propose one real quick:

Let there be a `SelfAssessing` interface, which is to be implemented by any class that utilizes delayed immutability. This interface is to have just one method, `IsImmutable()`, which is expected to return `false` for as long as the object is mutable, and to start returning `true` once the object becomes immutable.
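In Java, the proposal might look something like the following sketch. The method is spelled `isImmutable()` to follow Java naming conventions, and the `Node` class with its `freeze()` method is a hypothetical example of a type that needs delayed immutability in order to form a cyclic graph.

```java
// Sketch of the proposed interface; the Java spelling of the method name is an assumption.
interface SelfAssessing {
    boolean isImmutable();
}

// Hypothetical node that can take part in a cyclic graph and is frozen afterwards.
final class Node implements SelfAssessing {
    private final String name;
    private Node next;      // variable, so that cycles can be built
    private boolean frozen; // becomes true once initialization is complete

    Node(String name) {
        this.name = name;
    }

    void setNext(Node next) {
        if (frozen) throw new IllegalStateException("node is frozen");
        this.next = next;
    }

    void freeze() {
        frozen = true;
    }

    String name() {
        return name;
    }

    Node next() {
        return next;
    }

    @Override
    public boolean isImmutable() {
        return frozen;
    }
}
```

Two nodes can be pointed at each other and then frozen; from that moment on, `setNext` refuses to work and `isImmutable()` returns `true`.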

Note that static analysis is by nature limited to examining types, but delayed immutability requires invoking a method of an instance of a type. Thus, static analysis completely fails to assess delayed immutability. Furthermore, a delayed immutable may appear as a field in any type, meaning that static analysis fails to assess potentially any type.

Since static analysis fails in the presence of abstraction and/or delayed immutability, it follows that we have to examine not just types, but also the instances of types in the running software system. This calls for dynamic analysis.

Dynamic Analysis

The term "dynamic analysis" refers to examining various aspects of a software system as it runs. In some cases the aim is to examine the behavior of the software, in other cases (such as the case at hand) it is to examine the data structures it creates. Dynamic analysis may require (and in the case at hand it does require) static analysis as a prerequisite.

With dynamic analysis we can look past the advertised type of a field, which may be abstract, and obtain the instance stored in the field, (the value of the field,) in order to find out the actual, concrete type of that instance.

Once we have the concrete type of an instance, we can assess whether it is immutable, and this may involve recursively assessing any instances referenced by that instance. If everything is immutable, then and only then can the containing instance be assessed as immutable.

To make all of this work, we begin with static analysis where we use reflection to examine a type with the goal of giving it one of three possible assessments:

- Mutable
- Immutable
- Inconclusive

These type assessments are issued as follows:

The mutable type assessment is issued if:

- The type has any fields that are variable, (non-final/non-readonly,) because such fields are mutable no matter what their advertised type (field type) is.
- The type has nothing but invariable fields, but one or more of them is of an advertised type that has received a mutable assessment, because this means that the containing type is not deeply immutable.

The immutable type assessment is issued if a type consists exclusively of fields that are both invariable and of an advertised type which has received an immutable assessment.

The inconclusive type assessment is issued if:

- The type is abstract or extensible (non-final/non-sealed.)
- The type is self-assessing.
- The type contains any fields of an advertised type that has in turn received an inconclusive assessment.

Note that the above are type assessments, issued on types, by static analysis alone.
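The rules above can be sketched in Java with nothing but reflection. This is an illustration under stated assumptions, not a complete implementation: all names (`Assessment`, `TypeAssessor`, `Point`, `Counter`, `Holder`) are hypothetical; the order in which the rules are checked is my own choice (self-assessing wins over everything, and a variable field makes even an extensible type mutable); and a type that is encountered a second time during the same assessment is treated as contributing nothing new, so that self-referencing types do not cause infinite recursion.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.HashSet;
import java.util.Set;

enum Assessment { MUTABLE, IMMUTABLE, INCONCLUSIVE }

interface SelfAssessing {
    boolean isImmutable();
}

final class TypeAssessor {
    static Assessment assess(Class<?> type) {
        return assess(type, new HashSet<>());
    }

    private static Assessment assess(Class<?> type, Set<Class<?>> visited) {
        if (type.isPrimitive()) return Assessment.IMMUTABLE;
        if (type.isArray()) return Assessment.MUTABLE; // arrays are always mutable
        // Assumption: a self-assessing type is inconclusive regardless of its fields.
        if (SelfAssessing.class.isAssignableFrom(type)) return Assessment.INCONCLUSIVE;
        // Assumption: a type seen earlier in this assessment adds nothing new.
        if (!visited.add(type)) return Assessment.IMMUTABLE;

        // Interfaces, abstract classes and extensible classes are all non-final.
        boolean inconclusive = !Modifier.isFinal(type.getModifiers());
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            for (Field field : c.getDeclaredFields()) {
                if (Modifier.isStatic(field.getModifiers())) continue;
                if (!Modifier.isFinal(field.getModifiers())) return Assessment.MUTABLE;
                Assessment fieldAssessment = assess(field.getType(), visited);
                if (fieldAssessment == Assessment.MUTABLE) return Assessment.MUTABLE;
                if (fieldAssessment == Assessment.INCONCLUSIVE) inconclusive = true;
            }
        }
        return inconclusive ? Assessment.INCONCLUSIVE : Assessment.IMMUTABLE;
    }
}

// Hypothetical types to assess.
final class Point {
    final int x;
    final int y;
    Point(int x, int y) { this.x = x; this.y = y; }
}

final class Counter {
    int count; // variable field
}

final class Holder {
    final Object value; // invariable field of an extensible advertised type
    Holder(Object value) { this.value = value; }
}
```

With these definitions, `Point` is assessed as immutable, `Counter` as mutable, and `Holder` as inconclusive, because nothing can be said about what its `Object` field will refer to at runtime.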

Every instance of a type that has received a mutable or immutable assessment is in turn mutable or immutable without the need to examine the contents of the instance; however, every instance of a type that has received an inconclusive assessment must be further examined to issue a final assessment for that instance only.

- The value of each field must be obtained from the instance, and assessment must recursively be applied on that value.
- If the type is self-assessing, then the `IsImmutable()` method must be invoked on the instance, to ask it whether it is immutable or not.
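Here is a deliberately simplified Java sketch of instance assessment. To stay short, it skips the type-level shortcut and always walks the field values of the instance, which gives the same answer at a higher cost. All names are hypothetical. A handful of JDK types are treated as immutable up front, because reading their private fields via reflection is typically not permitted on recent Java versions; a self-assessing instance is simply trusted; and instances already encountered during the walk are not examined again, so that cyclic graphs terminate.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

interface SelfAssessing {
    boolean isImmutable();
}

final class InstanceAssessor {
    // Assumption: these JDK types are promised to be immutable instead of being examined.
    private static final Set<Class<?>> PREASSESSED = Set.of(
            String.class, Boolean.class, Character.class,
            Integer.class, Long.class, Double.class);

    static boolean isImmutable(Object instance) {
        Set<Object> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        return isImmutable(instance, visited);
    }

    private static boolean isImmutable(Object instance, Set<Object> visited) {
        if (instance == null) return true;       // nothing there to mutate
        if (!visited.add(instance)) return true; // already being examined (cycle)
        Class<?> type = instance.getClass();     // the concrete type, not the advertised one
        if (PREASSESSED.contains(type)) return true;
        if (type.isArray()) return false;
        if (instance instanceof SelfAssessing) return ((SelfAssessing) instance).isImmutable();
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            for (Field field : c.getDeclaredFields()) {
                if (Modifier.isStatic(field.getModifiers())) continue;
                if (!Modifier.isFinal(field.getModifiers())) return false; // variable field
                if (field.getType().isPrimitive()) continue;
                if (!isImmutable(read(field, instance), visited)) return false;
            }
        }
        return true;
    }

    private static Object read(Field field, Object instance) {
        try {
            field.setAccessible(true); // may be refused for fields of JDK classes
            return field.get(instance);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }
}

// Hypothetical types to assess.
final class Holder {
    final Object value;
    Holder(Object value) { this.value = value; }
}

final class Counter {
    int count;
}
```

A `Holder` type is inconclusive on its own, but `new Holder("text")` is assessed as immutable while `new Holder(new Counter())` is assessed as mutable, because the decision is made on the concrete type of the value actually stored in the field.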

Both type assessment and instance assessment can be expensive; however, note the following:

- Once a type assessment has been issued, it will never change, so it can be cached, and never recomputed again.
- Instance assessments can be requested only from within assertions, meaning that they can incur zero runtime overhead on production.
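Both points can be sketched in a few lines of Java. The names `TypeAssessmentCache` and `MessageQueue` are hypothetical, and the actual assessment logic is passed in from the outside so that the sketch stays self-contained. The `computations` counter exists only to make the caching observable and is not thread-safe.

```java
import java.util.ArrayDeque;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;

// Caches one result per type; the assessor function must not return null.
final class TypeAssessmentCache<R> {
    private final Map<Class<?>, R> cache = new ConcurrentHashMap<>();
    private final Function<Class<?>, R> assessor;
    int computations; // demonstration only

    TypeAssessmentCache(Function<Class<?>, R> assessor) {
        this.assessor = assessor;
    }

    R assess(Class<?> type) {
        R result = cache.get(type);
        if (result == null) {
            computations++;
            result = assessor.apply(type);
            cache.put(type, result);
        }
        return result;
    }
}

// Hypothetical queue that checks its messages only when assertions are enabled (java -ea).
final class MessageQueue {
    private final Queue<Object> messages = new ArrayDeque<>();
    private final Predicate<Object> isImmutable;

    MessageQueue(Predicate<Object> isImmutable) {
        this.isImmutable = isImmutable;
    }

    void post(Object message) {
        // When assertions are disabled, the predicate is never invoked.
        assert isImmutable.test(message) : "message must be immutable: " + message;
        messages.add(message);
    }

    int size() {
        return messages.size();
    }
}
```

In Java, the expression of an `assert` statement is not evaluated unless assertions are enabled, so the instance assessment costs nothing in a production run.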

Note that for static analysis we employed nothing but reflection to examine the fields of a type, and for dynamic analysis we also employed nothing but reflection to examine the values of fields of instances, so no code analysis was necessary. However, for the sake of completeness, let us also take a brief look at code analysis.

Code Analysis

There is a school of thought according to which the answer to the immutability assessment question lies in analyzing the executable instructions that comprise a type to determine whether any fields are mutated by code outside of the constructor.

The problem with code analysis is that it is a form of static analysis, so it suffers from the disadvantages of static analysis that were previously explained.

Suppose that code analysis determines that a type does not mutate any fields outside of its constructor; suppose, however, that the type contains a field of abstract type, which gets initialized from a constructor parameter; is this type mutable or immutable? Obviously, it depends on the concrete type of the instance that will be stored, at runtime, in that field. So, we are back at square one, where static analysis simply does not work in the face of abstraction. Therefore, code analysis is not the answer.

Code analysis could potentially be useful, as a supplement to dynamic analysis, in the following ways:

- In some cases, a type contains a field which is written by a method other than the constructor. For this to work, the field has to be variable. (Non-final/non-readonly.) Thus, with the use of reflection alone, this type will be assessed as mutable. However, it may be that the method which writes the field makes sure that the field is only written once during the lifetime of the instance, and that it gets written before it is ever read, so it will never appear to mutate as far as external observers can tell. Thus, the type is effectively immutable. It is in theory possible (though not easy) for code analysis to detect that the field is treated in this way, thus allowing the type to be assessed as immutable.
- Sometimes a type contains fields that are only written by the constructor, but the programmer who wrote that type forgot to declare them as invariable (final/readonly) and did not pay attention to the warnings / inspections / analysis messages. If we were to only use reflection, these fields would be considered variable, so the type would in turn be assessed as mutable. Code analysis can detect that the fields are not written outside of the constructor, allowing them to be assessed as invariable, and therefore the type to be assessed as immutable.

Preassessment

There exist types that would normally receive a mutable assessment, but we know for sure that they are practically immutable. A famous example of such a type, both in Java and in C#, is class `String`. (In Java, for instance, `String` caches its hash code in a non-final field and keeps its characters in an array.) In such cases, we must be able to preassess the type as immutable, which means assigning an immutable assessment to the type without analyzing it.

Note that preassessment constitutes a promise, and promises can be false. If a type which is actually mutable is mistakenly preassessed as immutable, bad things are bound to happen.
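A preassessment facility can be as simple as a registry of types. The following Java sketch uses a hypothetical `PreassessmentRegistry` class; it records promises and verifies nothing, which is precisely the danger described above.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class PreassessmentRegistry {
    private final Set<Class<?>> immutableTypes = ConcurrentHashMap.newKeySet();

    // Records a promise that the given type is immutable; the type is not analyzed.
    void preassessAsImmutable(Class<?> type) {
        immutableTypes.add(type);
    }

    // To be consulted by type assessment before it resorts to reflection.
    boolean isPreassessedAsImmutable(Class<?> type) {
        return immutableTypes.contains(type);
    }
}
```

Type assessment would consult such a registry first, and fall back to examining fields only for types that have not been preassessed.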

Generic Shallow Preassessment

Some generic types are effectively immutable containers. In Java, which uses type erasure, these are essentially containers of elements of type `Object`, so they are by definition inconclusive; however, in C# the type of the generic type argument is known at runtime, so we can do better than that. When a generic effectively immutable container type is constructed with an actual type parameter, the immutability of the resulting type depends on the immutability of that parameter:

- If the generic type parameter is a mutable type, then the constructed generic container type is mutable, so instances of that type do not need to be assessed.
- If the generic type parameter is an immutable type, then the constructed generic container type is immutable, so again, instances of that type do not need to be assessed.
- If the generic type parameter is inconclusive, then the constructed generic container type is inconclusive, which means that for every instance of that type, all elements in the container must be assessed.

In order to be able to assess the elements of a container, the preassessment for the container must include an object known as a deconstructor. Dynamic analysis will be invoking the deconstructor to enumerate the elements contained within each instance of the container, so that each element can be assessed. Deconstructors are generally trivial:

- The deconstructor for collections simply yields all the elements of the collection.
- The deconstructor for maps/dictionaries simply yields all the mappings. (Map entries / key-value pairs.)
- The deconstructor for `Lazy<T>` simply yields the one and only value contained within the lazy object.
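As a rough Java sketch, a deconstructor can be modeled as a one-method interface. The names `Deconstructor`, `Deconstructors` and `allElementsPass` are hypothetical, and `Lazy<T>` is left out because it is a C# type with no direct counterpart in the Java standard library. The element check is passed in as a predicate, standing in for whatever instance assessment is in use.

```java
import java.util.Collection;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical interface: enumerates the objects contained in a container instance.
interface Deconstructor<T> {
    Iterable<?> deconstruct(T container);
}

final class Deconstructors {
    // A collection yields its elements.
    static final Deconstructor<Collection<?>> COLLECTION = collection -> collection;

    // A map yields its entries (key-value pairs).
    static final Deconstructor<Map<?, ?>> MAP = map -> map.entrySet();

    // Applies the given check to every object yielded by the deconstructor.
    static <T> boolean allElementsPass(
            T container, Deconstructor<T> deconstructor, Predicate<Object> isImmutable) {
        for (Object element : deconstructor.deconstruct(container)) {
            if (!isImmutable.test(element)) return false;
        }
        return true;
    }
}
```

Note that the map deconstructor yields entry objects, so whatever assesses them must in turn look at both the key and the value of each entry.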

Preassessment is mainly intended for types that have been defined by others, and thus we cannot modify their source code. For types that we write ourselves, we want a finer level of control: we want to be able to override the assessment of specific fields only, and allow all other fields to be assessed the normal way, to catch situations where we thought that some field was immutable, while in fact assessment of that field shows that it is not immutable. For that, we need field overrides.

Field Overrides

Sometimes a field is variable, but we want to promise that we will only vary it in an effectively immutable way. For such cases, there must be an annotation/attribute that we can attach to that field, to indicate that analysis should treat the field as invariable.

Array Field Overrides

Arrays are by definition mutable in Java and C#, and by extension so is any type that contains an array field, even if the field itself is invariable. If we want to be able to create an immutable type that contains an array field, there must be an annotation/attribute that we can attach to that array field, to indicate that analysis should treat the array itself as invariable.
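In Java, both kinds of field override could be expressed as runtime-retained annotations that the assessment code looks for via reflection. The annotation names `@InvariableField` and `@InvariableArray`, the `Word` class and the `FieldOverrides` helper are all hypothetical, chosen only for this sketch.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Arrays;

// Promise: the annotated variable field is only varied in an effectively immutable way.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface InvariableField {}

// Promise: the contents of the annotated array are never observably mutated.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface InvariableArray {}

final class Word {
    @InvariableArray private final char[] letters; // never written after construction
    @InvariableField private int cachedHash;       // holds either 0 or the one computed value
    private int lookups;                           // an ordinary variable field, no promise

    Word(String text) {
        letters = text.toCharArray();
    }

    int hash() {
        lookups++;
        if (cachedHash == 0) cachedHash = Arrays.hashCode(letters);
        return cachedHash;
    }
}

final class FieldOverrides {
    static boolean isTreatedAsInvariable(Class<?> type, String fieldName) {
        Field field = find(type, fieldName);
        return Modifier.isFinal(field.getModifiers())
                || field.isAnnotationPresent(InvariableField.class);
    }

    static boolean isArrayTreatedAsInvariable(Class<?> type, String fieldName) {
        Field field = find(type, fieldName);
        return field.getType().isArray() && field.isAnnotationPresent(InvariableArray.class);
    }

    private static Field find(Class<?> type, String fieldName) {
        try {
            return type.getDeclaredField(fieldName);
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

Just like preassessment, each annotation is a promise made by the programmer; the `lookups` field shows that an unannotated variable field still counts against the type.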

Elucidation

Once we have immutability assessment working as described in the preceding sections, a new challenge becomes apparent: sometimes, a data structure that was intended to be immutable will be assessed as mutable due to some tiny programmer mistake. If the data structure is large and complex, it might not be obvious where the mistake is. The programmer will receive a mutable assessment, but will not know why it was given and where to look to find the problem.

For this reason, every mutable instance assessment must come with a sentence explaining to the programmer why the assessment was issued. Since every mutable instance assessment typically has one or more other assessments that are the reasons that led to it, these sentences will often form entire trees, each sentence being further explained by nested sentences.

I call this feature elucidation.
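One possible shape for such a tree of sentences, sketched in Java (11 or later, for `String.repeat`). The `Explanation` class and the indentation-based rendering are assumptions of this sketch, and the sentences used in the usage example are invented.

```java
import java.util.List;

// Hypothetical tree node: one sentence, further explained by zero or more nested sentences.
final class Explanation {
    final String sentence;
    final List<Explanation> causes;

    Explanation(String sentence, Explanation... causes) {
        this.sentence = sentence;
        this.causes = List.of(causes);
    }

    // Renders the tree as indented text, one sentence per line.
    String render() {
        StringBuilder out = new StringBuilder();
        render(out, 0);
        return out.toString();
    }

    private void render(StringBuilder out, int depth) {
        out.append("  ".repeat(depth)).append(sentence).append('\n');
        for (Explanation cause : causes) {
            cause.render(out, depth + 1);
        }
    }
}
```

For example, `new Explanation("Holder is mutable", new Explanation("field 'value' refers to a mutable Counter", new Explanation("field 'count' is variable")))` renders as three lines, each indented one level deeper than the one it explains.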

Appendix

Immutability assessment is awesome, but the more the compiler can do for us, the better.

Here are some examples of what compilers of (non-purely functional) programming languages could be doing for us in the direction of compiler-enforced immutability:

- A language could support an 'immutable' class modifier, which would require the class to contain only immutable members. An immutable class may not extend a mutable class, and a mutable class may not extend an immutable class. (Although a mutable class may extend a class which has not been marked as immutable, even if that class happens to be immutable.)
- A language could support an 'immutable' modifier for function arguments and for fields, requiring that they may only be assigned from concrete types that are immutable, or from other fields or function arguments that are also immutable.
- A language could support an 'immutable' generic parameter constraint, which would mandate that only immutable types can be used as generic type arguments.
- A language could support a 'stable' field modifier, allowing a mutable field to appear in an immutable class, and acting as a promise that the field will only be mutated in a way which upholds effective immutability.
- A language could support a 'stable array' field modifier for array fields, allowing an array to appear in an immutable class, and acting as a promise that the contents of the array will either not be mutated, or they will only be mutated in a way which upholds effective immutability.

etc.

Cover image created by ChatGPT. The prompt used was: "Please give me a photographic quality image of a big diamond floating against a completely black background. Make it landscape."

Scratch

(Ignore)

- As it turns out, the mutability of value types is largely irrelevant, as explained here:
[Vladimir Sadov: "C# Tuples. Why mutable structs?"](http://mustoverride.com/tuples_structs/)

2025-06-04
