2025-06-04

Immutability Assessment

In languages like Java and C#

Abstract

The need is identified for programmatically ascertaining the immutability of certain objects used in situations where they are expected to be immutable. The technicalities of immutability assessment are discussed. A mechanism is described for achieving it.

(Useful pre-reading: About these papers)

The Problem

Raise your hand if you have ever had to troubleshoot a bug that manifested itself in mysterious ways, defied rational explanation, tenaciously evaded detection, made you rage impotently at the absurdity of the observed behavior, and after much weeping and wailing and gnashing of teeth, turned out to be due to one of the following reasons:

1. Inadvertently mutating an object that has been added as a key in a hash map.

2. Inadvertently mutating an object that has been passed to another thread.

Or, in general:

3. One piece of code mutating an object that another piece of code groundlessly assumes will remain unchanged.

These mishaps of course happen due to the fact that the objects involved should have been immutable, but they were not. If an object is immutable, nobody can mutate it, and therefore nobody has to assume that it will not change.
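To make the first mishap concrete, here is a minimal Java sketch. The `Point` and `LostKeyDemo` types are made up for illustration; only `HashMap` and `Objects.hash` come from the standard library.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical mutable key type, invented for this illustration.
final class Point {
    int x, y; // variable (non-final) fields, so the key can change after insertion

    Point(int x, int y) { this.x = x; this.y = y; }

    @Override public boolean equals(Object o) {
        if (!(o instanceof Point)) return false;
        Point p = (Point) o;
        return p.x == x && p.y == y;
    }

    @Override public int hashCode() { return Objects.hash(x, y); }
}

final class LostKeyDemo {
    // Returns true if the entry can still be found after its key is mutated.
    static boolean foundAfterMutation() {
        Map<Point, String> map = new HashMap<>();
        Point key = new Point(1, 2);
        map.put(key, "treasure");
        key.x = 99; // inadvertent mutation: the hash code of the key changes
        // Try both the mutated key and a fresh copy of the original key.
        return map.containsKey(key) || map.containsKey(new Point(1, 2));
    }

    public static void main(String[] args) {
        System.out.println("found after mutation: " + foundAfterMutation());
    }
}
```

The entry is still stored in the map, but neither lookup reaches it: the mutated key hashes differently from the hash recorded at insertion, and a fresh copy of the original key no longer equals the stored key.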

So, could hash maps somehow require that their keys be immutable? Could threads somehow require that objects shared among them be immutable?

This leads us to the more general question of how to ascertain immutability.

Compiler-Enforced Immutability

Inadvertent mutation is not a problem in purely functional programming languages, where there simply is no such thing as mutation. However, most programmers do not use such languages, because they are cumbersome to work with. Most programmers use languages like Java and C#, which are not purely functional, so they allow mutation, and so inadvertent mutation can sometimes happen.

Java and C# do support a few constructs for defining invariable (final/readonly) class members, but they are woefully inadequate. Systematic compiler support for declaring and requiring immutability would greatly help to reduce the volume of mistakes being made, but nothing like that exists, and even if it did exist, it would not be a panacea, because there are situations where the compiler cannot help.

Since compiler-enforced immutability is not available, we have to enforce it ourselves, which means that we have to programmatically detect immutability and ascertain it.

Languages like Java and C# offer full reflection support, so we can examine every field of every type (static analysis), and we can even examine the values of the fields of instances (dynamic analysis). Furthermore, these languages compile into intermediate code, which is relatively easy to parse and reason about, meaning that if we want to, we can even analyze executable code (more static analysis).

So, the question is what to analyze, and how.

Superficial vs. Deep Immutability

Many classes have the term "immutable" in their name, but they are only superficially immutable. Take a generic immutable collection for example: `ImmutableCollection<T>`. Let us trust that it does in fact behave perfectly immutably, and therefore it does, arguably, deserve to be called immutable; let us now ask: would an instance of this class be safe to pass to another thread? The answer is that it depends on the actual type of the generic parameter: If `T` is immutable, it is safe; but if `T` is mutable, then it is absolutely not safe.

So, in order to reap any benefits whatsoever from immutability, it must be deep immutability. Shallow immutability is irrelevant. Please keep this in mind, as it has severe implications in our quest to ascertain the immutability of anything.
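A small Java sketch of the difference, using the unmodifiable list returned by `List.of` (Java 9 or later) as a stand-in for `ImmutableCollection<T>`. The `ShallowDemo` class is a made-up name.

```java
import java.util.List;

final class ShallowDemo {
    // Mutates an element that lives inside an unmodifiable list.
    static String mutateThroughImmutableList() {
        StringBuilder element = new StringBuilder("original");
        List<StringBuilder> list = List.of(element); // shallowly immutable container
        // list.add(...) would throw UnsupportedOperationException,
        // yet the element itself is still freely mutable:
        list.get(0).append(" + changed");
        return list.get(0).toString();
    }

    public static void main(String[] args) {
        System.out.println(mutateThroughImmutableList());
    }
}
```

The list never changes which objects it holds, but any thread that receives it can still observe the contents of those objects changing.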

Immutability Assessment via Static Analysis

The term "static analysis" refers to examining the code, (as written, or as compiled,) but not the results of executing that code. Consequently, static analysis can examine the definitions of data structures, but not the actual contents of those data structures during runtime.

A popular but naive understanding of immutability is that it is an inherent characteristic of types, and that the instances of the types (i.e. the objects) simply follow suit. According to this understanding, all we need to do is to ascertain that a certain type is immutable, and from that moment on we know that all of its instances are immutable.

This understanding is not entirely false, but it is very limiting, because it means that only concrete and non-extensible (a.k.a. final, sealed) types can potentially be assessed as immutable: All interfaces must necessarily be considered as mutable, because we have no idea how they may be implemented, and all abstract or simply extensible types must also necessarily be considered as mutable, because we have no idea how they may be extended.

This poses an insurmountable problem if we want to have, say, a queue for exchanging messages between threads, where the messages are organized in a class hierarchy: such a queue would not be able to ascertain the immutability of the messages it handles, because all it knows is the base-most `Message` class, which is necessarily extensible, and therefore mutable, as far as static analysis can tell.

Now, consider that many perfectly immutable classes tend to be passed around as interfaces, (e.g. `Comparer`, `Hasher`, `Predicate`, all sorts of stateless converters, etc.) that these interfaces are often stored in fields, and that a field of mutable type makes the class containing that field also mutable. It quickly becomes evident that static analysis can only work in a universe where no abstraction is utilized; however, this is not our universe: we use languages like Java and C# precisely because we want the benefits of unlimited abstraction.

One final nail in the coffin of static analysis is the issue of delayed immutability.

Delayed Immutability

Some objects begin life as mutable, so that they can undergo some non-trivial initialization, and become immutable later, once initialization is complete. This behavior is necessary when creating cyclic graphs of immutable objects, or when creating an immutable object while loading its contents from some external storage.

Alternative terms used by others for this kind of immutability are Freezing and Popsicle immutability.

There is no standard way of representing delayed immutability, so let me propose one real quick:

Let there be a `SelfAssessing` interface, which is to be implemented by any class that utilizes delayed immutability. This interface is to have just one method, `IsImmutable()`, which is expected to return `false` for as long as the object is mutable, and to start returning `true` once the object becomes immutable.
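In Java, the proposal could look roughly like the sketch below. Following Java naming conventions, the method is spelled `isImmutable()` here; the `Node` class, with its `freeze()` method, is a hypothetical example of a type that needs delayed immutability in order to form a cyclic graph.

```java
// Sketch of the proposed interface, with Java-style method naming.
interface SelfAssessing {
    boolean isImmutable();
}

// Hypothetical node type that stays mutable while a cyclic graph is wired up.
final class Node implements SelfAssessing {
    private final String name;
    private Node next;      // variable, so that cycles can be closed
    private boolean frozen; // flips to true exactly once

    Node(String name) { this.name = name; }

    void setNext(Node next) {
        if (frozen) throw new IllegalStateException(name + " is frozen");
        this.next = next;
    }

    void freeze() { frozen = true; }

    Node next() { return next; }

    @Override public boolean isImmutable() { return frozen; }
}
```

Two nodes can be linked to each other with `setNext` and then frozen; from that point on `isImmutable()` returns `true` and any further `setNext` call is rejected.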

Note that static analysis is by nature limited to examining types, but delayed immutability requires invoking a method of an instance of a type. Thus, static analysis completely fails to assess delayed immutability. Furthermore, an object with delayed immutability may be stored in a field of any type, meaning that static analysis can potentially fail to assess any type.

From this it follows that if we want to have the slightest hope of assessing immutability, we have to examine not just types, but also the instances of types in the running software system. This calls for dynamic analysis.

Immutability Assessment via Dynamic Analysis

The term "dynamic analysis" refers to examining various aspects of a software system as it runs. In some cases the aim is to examine the behavior of the software, in other cases (such as the case at hand) it is to examine the data structures it creates. Dynamic analysis may require (and in the case at hand it does require) static analysis as a prerequisite.

As already explained, static analysis alone is woefully inadequate in the presence of abstraction, and in the presence of delayed immutability. As it turns out, for many fields we have to ignore the advertised type of the field, which is abstract, and instead obtain the instance stored in the field, (the value of the field,) in order to find out the actual, concrete type of that instance.

Once we have the concrete type of an instance, we can assess whether it is immutable, and this may involve recursively assessing any instances referenced by that instance. If everything is immutable, then and only then can the instance in question be assessed as immutable.

To make all of this work, we begin with static analysis where we use reflection to examine a type with the goal of giving it one of three possible assessments:

  • Mutable
  • Immutable
  • Inconclusive

The mutable assessment is issued for a type if it has any fields that are variable, (non-final/non-readonly,) because such fields are mutable no matter what their advertised type (field type) is. It is also issued if the type has any fields with an advertised type that has received a mutable assessment, because this means that the type in question is not deeply immutable.

The immutable assessment is issued for a type if it consists exclusively of fields that are both invariable and of an advertised type which has received an immutable assessment.

The inconclusive assessment is issued for a type if it is self-assessing, if it is abstract, if it is extensible, (non-final/non-sealed,) or if it contains any fields of an advertised type that has in turn received an inconclusive assessment.

Note that the above are type assessments, issued by static analysis alone.
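The three rules can be sketched with plain Java reflection as follows. This is a simplified illustration under several assumptions, not a complete implementation: `TypeAssessor`, `Assessment` and the four sample classes are hypothetical names, `String` is seeded as immutable (anticipating the preassessment idea discussed later), arrays are treated as mutable, and cyclic type references are broken with an optimistic provisional entry.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.HashMap;
import java.util.Map;

enum Assessment { MUTABLE, IMMUTABLE, INCONCLUSIVE }

// Repeated here only so that this block compiles on its own.
interface SelfAssessing { boolean isImmutable(); }

// Hypothetical assessor; not thread-safe, and deliberately simplified.
final class TypeAssessor {
    private final Map<Class<?>, Assessment> cache = new HashMap<>();

    TypeAssessor() {
        // Assumption: String is seeded as immutable rather than analyzed.
        cache.put(String.class, Assessment.IMMUTABLE);
    }

    Assessment assess(Class<?> type) {
        Assessment known = cache.get(type);
        if (known != null) return known;
        // Optimistic provisional entry, so that cyclic type references terminate.
        // A fuller version would revisit types assessed while the cycle was open.
        cache.put(type, Assessment.IMMUTABLE);
        Assessment result = compute(type);
        cache.put(type, result);
        return result;
    }

    private Assessment compute(Class<?> type) {
        if (type.isPrimitive()) return Assessment.IMMUTABLE;
        if (type.isArray()) return Assessment.MUTABLE;
        if (SelfAssessing.class.isAssignableFrom(type)) return Assessment.INCONCLUSIVE;
        // Interfaces, abstract classes and other non-final classes are all extensible.
        boolean inconclusive = !Modifier.isFinal(type.getModifiers());
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            for (Field field : c.getDeclaredFields()) {
                int modifiers = field.getModifiers();
                if (Modifier.isStatic(modifiers)) continue; // only instance state matters
                if (!Modifier.isFinal(modifiers)) return Assessment.MUTABLE; // variable field
                Assessment fieldType = assess(field.getType());
                if (fieldType == Assessment.MUTABLE) return Assessment.MUTABLE;
                if (fieldType == Assessment.INCONCLUSIVE) inconclusive = true;
            }
        }
        return inconclusive ? Assessment.INCONCLUSIVE : Assessment.IMMUTABLE;
    }
}

// Hypothetical types to assess.
final class Celsius { private final double degrees; Celsius(double d) { degrees = d; } }
final class Counter { private int count; }
final class Holder { private final Runnable task; Holder(Runnable t) { task = t; } }
class Message { private final String text; Message(String t) { text = t; } }
```

Under these rules `Celsius` comes out as immutable, `Counter` as mutable (variable field), `Holder` as inconclusive (field of interface type), and `Message` as inconclusive (extensible class).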

Every instance of a type that has received a mutable or immutable assessment is in turn mutable or immutable without the need to examine the contents of the instance; however, every instance of a type that has received an inconclusive assessment must be further examined to issue a final assessment for that instance only:

  • The value of each field must be obtained from the instance, and assessment must recursively be applied on that value.
  • If the type is self-assessing, then the `IsImmutable()` method of that interface must be invoked on the instance, to ask it whether it is immutable or not.
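The two steps above can be sketched in Java as follows. To keep the example short, this version skips the cached type-level shortcut and examines every instance directly; it is a sketch under stated assumptions, not a finished implementation. `InstanceAssessor`, `Envelope` and `Draft` are hypothetical names, a handful of JDK types are simply assumed to be immutable, and any field that reflection is not allowed to read is conservatively treated as mutable. It assumes Java 9 or later for `trySetAccessible`.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Repeated here only so that this block compiles on its own.
interface SelfAssessing { boolean isImmutable(); }

final class InstanceAssessor {
    // Assumption: these JDK types are taken to be immutable without analysis.
    private static final Set<Class<?>> PREASSESSED =
            Set.of(String.class, Integer.class, Long.class, Double.class, Boolean.class);

    static boolean isImmutable(Object root) {
        Set<Object> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        return check(root, visited);
    }

    private static boolean check(Object obj, Set<Object> visited) {
        if (obj == null || PREASSESSED.contains(obj.getClass())) return true;
        if (!visited.add(obj)) return true; // already being examined (cyclic graph)
        if (obj.getClass().isArray()) return false; // arrays are mutable
        boolean selfAssessing = obj instanceof SelfAssessing;
        if (selfAssessing && !((SelfAssessing) obj).isImmutable()) return false;
        for (Class<?> c = obj.getClass(); c != null; c = c.getSuperclass()) {
            for (Field field : c.getDeclaredFields()) {
                int modifiers = field.getModifiers();
                if (Modifier.isStatic(modifiers)) continue;
                // A self-assessing object vouches for its own variable fields.
                if (!selfAssessing && !Modifier.isFinal(modifiers)) return false;
                if (field.getType().isPrimitive()) continue; // primitive values cannot change
                if (!field.trySetAccessible()) return false; // unreadable: assume the worst
                try {
                    // Assess the actual value, ignoring the advertised field type.
                    if (!check(field.get(obj), visited)) return false;
                } catch (IllegalAccessException e) {
                    return false;
                }
            }
        }
        return true;
    }
}

// Hypothetical type whose static assessment would be inconclusive (field of type Object).
final class Envelope {
    private final Object payload;
    Envelope(Object payload) { this.payload = payload; }
}

// Hypothetical self-assessing type with delayed immutability.
final class Draft implements SelfAssessing {
    private String text = "";
    private boolean frozen;
    void write(String t) {
        if (frozen) throw new IllegalStateException("frozen");
        text = t;
    }
    void freeze() { frozen = true; }
    @Override public boolean isImmutable() { return frozen; }
}
```

An `Envelope` holding a string is assessed as immutable, one holding an array is not, and one holding a `Draft` changes from mutable to immutable at the moment the draft is frozen.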

Both type assessment and instance assessment can be expensive; however, note the following:

1. Type assessments can be cached, so each type needs to be assessed only once;

2. Instance assessments can be confined to assertions, meaning that they incur zero runtime overhead on production runs, where assertions are disabled.
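The second point can be illustrated in Java, where the expression of an `assert` statement is not evaluated unless assertions are enabled (`java -ea`). The sketch below is hypothetical throughout: `KeyCheckedMap` is a made-up wrapper, and the immutability check is a stand-in predicate rather than a real assessor.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

// Hypothetical wrapper: a map that insists on immutable keys, but only in
// runs where assertions are enabled.
final class KeyCheckedMap<K, V> {
    private final Map<K, V> map = new HashMap<>();
    private final Predicate<Object> immutabilityCheck;

    KeyCheckedMap(Predicate<Object> immutabilityCheck) {
        this.immutabilityCheck = immutabilityCheck;
    }

    V put(K key, V value) {
        // With assertions disabled, the expression below is never evaluated.
        assert immutabilityCheck.test(key) : "mutable key: " + key;
        return map.put(key, value);
    }

    V get(K key) { return map.get(key); }
}

final class AssertionCostDemo {
    // Counts how often the stand-in immutability check actually runs for one put().
    static int checksPerformed() {
        AtomicInteger calls = new AtomicInteger();
        // Stand-in for a real assessor: accepts strings only.
        Predicate<Object> check = o -> {
            calls.incrementAndGet();
            return o instanceof String;
        };
        KeyCheckedMap<String, Integer> map = new KeyCheckedMap<>(check);
        map.put("answer", 42);
        return calls.get();
    }

    // Reports whether assertions are enabled for this class.
    static boolean assertionsEnabled() {
        boolean enabled = false;
        assert enabled = true; // the assignment happens only when assertions are on
        return enabled;
    }
}
```

With assertions enabled the check runs once per `put`; with assertions disabled it does not run at all, so the production cost is limited to the skipped `assert` statement.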

    Note that for static analysis we employed nothing but reflection to examine the fields of a type, and for dynamic analysis we also employed nothing but reflection to examine the values of fields of instances, so no code analysis was necessary. However, for the sake of completeness, let us also take a brief look at code analysis.

    Code Analysis

    There is a school of thought according to which the answer to the immutability assessment question lies in analyzing the executable instructions that comprise the methods of a type to determine whether any fields are mutated by code outside of the constructor.

    The problem with code analysis is that it is a form of static analysis, so it suffers from the disadvantages of static analysis that were previously explained.

    Suppose that code analysis determines that a type does not mutate any fields outside of its constructor; suppose, however, that the type contains a field of abstract type, which gets initialized from a constructor parameter; is this type mutable or immutable? Obviously, it depends on the concrete type of the instance that will be stored, at runtime, in that field. So, we are back at square one, where static analysis simply does not work in the face of abstraction.

    Code analysis could potentially be useful, as a supplement to dynamic analysis, in the following ways:

    • In some cases, a type contains a field which is written by a method other than the constructor. For this to work, the field has to be variable. (Non-final/non-readonly.) Thus, with the use of reflection the type is assessed as mutable. However, it may be that the method which writes the field makes sure that the field is only written once during the lifetime of the instance, and that it gets written before it is ever read, so it will never appear to mutate as far as external observers can tell. Thus, the type is effectively immutable. It is in theory possible (though not easy) for code analysis to detect that the field is treated in this way, thus allowing the type to be assessed as immutable.
    • Sometimes a type contains fields that are only written by the constructor, but the programmer who wrote that type forgot to declare them as invariable (final/readonly) and did not pay attention to the warnings. If we were to only use reflection, these fields would be considered variable, so the type would in turn be assessed as mutable. Code analysis can detect that the fields are not written outside of the constructor, allowing them to be assessed as invariable, and therefore the type to be assessed as immutable.
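The first case is the familiar lazily cached hash code. The sketch below uses a hypothetical `CaseInsensitiveName` class: reflection sees a variable field and would therefore report the type as mutable, yet every write to the field stores the same value, so no external observer can ever notice a change.

```java
import java.util.Locale;

// Hypothetical type that is effectively immutable despite a variable field.
final class CaseInsensitiveName {
    private final String text;
    private int hash; // variable: written lazily, outside the constructor

    CaseInsensitiveName(String text) { this.text = text; }

    private String key() { return text.toLowerCase(Locale.ROOT); }

    @Override public boolean equals(Object o) {
        return o instanceof CaseInsensitiveName
                && ((CaseInsensitiveName) o).key().equals(key());
    }

    @Override public int hashCode() {
        int h = hash;
        if (h == 0) { // not computed yet (or genuinely zero, in which case it is recomputed)
            h = key().hashCode();
            hash = h; // every write stores the same value
        }
        return h;
    }
}
```

Recognizing this pattern automatically would require analyzing the body of `hashCode()`, which is exactly the kind of work that code analysis would have to take on.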

    Immutability Preassessment

    There exist types that would normally receive a mutable assessment, but we know for sure that they are practically immutable. A famous example of such a type, both in Java and in C#, is class `String`. In such cases, we must be able to preassess the type as immutable, which means to assign an immutable assessment to the type, without analyzing it.

    Note that preassessment constitutes a promise, and promises can be false. If a type which is actually mutable is preassessed as immutable, bad things are bound to happen.
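A minimal sketch of preassessment in Java, assuming a hypothetical `PreassessingAssessor` that consults a registry of promises before falling back to analysis. The `naiveAnalysis` method is a deliberately crude stand-in for real type assessment.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

enum Assessment { MUTABLE, IMMUTABLE, INCONCLUSIVE }

// Hypothetical assessor that honors preassessments before analyzing anything.
final class PreassessingAssessor {
    private final Set<Class<?>> promisedImmutable = ConcurrentHashMap.newKeySet();
    private final Function<Class<?>, Assessment> analysis;

    PreassessingAssessor(Function<Class<?>, Assessment> analysis) {
        this.analysis = analysis;
    }

    // Records a promise that the type is immutable. Nothing verifies the promise.
    void preassessImmutable(Class<?> type) { promisedImmutable.add(type); }

    Assessment assess(Class<?> type) {
        if (promisedImmutable.contains(type)) return Assessment.IMMUTABLE; // promise, not proof
        return analysis.apply(type);
    }

    // Crude stand-in for real analysis: any variable instance field means MUTABLE.
    static Assessment naiveAnalysis(Class<?> type) {
        for (Field field : type.getDeclaredFields()) {
            int modifiers = field.getModifiers();
            if (!Modifier.isStatic(modifiers) && !Modifier.isFinal(modifiers)) {
                return Assessment.MUTABLE;
            }
        }
        return Assessment.IMMUTABLE;
    }
}
```

In current OpenJDK builds `String` caches its hash code in a non-final field, so the stand-in analysis reports it as mutable until `preassessImmutable(String.class)` has been called.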

    Preassessment is mainly intended for types that have been defined by others, whose source code we have no control over. For types that we write ourselves, we want to allow most fields to be assessed the normal way, and only override the assessment of specific fields. For that, we use field overrides.

    Invariable Field Overrides

    Sometimes a field is variable, but we want to promise that we will only vary it in an effectively immutable way. For such cases, there must be an annotation/attribute that we can attach to that field, to indicate that analysis should treat the field as invariable.

    Invariable Array Field Overrides

    Arrays are by definition mutable in Java and C#, and by extension so is any type that contains an array field, even if the field itself is invariable. If we want to be able to create an immutable type that contains an array field, there must be an annotation/attribute that we can attach to that array field, to indicate that analysis should treat the array itself as invariable.
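Both field overrides could be expressed in Java as runtime-retained annotations. The names `@InvariableField` and `@InvariableArray` are assumptions made for this sketch, as are the `Samples`, `Unannotated` and `OverrideRules` classes; the check shown covers only the field-level rules, and an array of reference type would still need its elements assessed.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Promise: this variable field is only varied in an effectively immutable way.
@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.FIELD)
@interface InvariableField {}

// Promise: the contents of this array are never observably changed.
@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.FIELD)
@interface InvariableArray {}

final class Samples {
    @InvariableArray private final double[] values;         // never written after construction
    @InvariableField private double cachedSum = Double.NaN; // written lazily, always the same value

    Samples(double... values) { this.values = values.clone(); } // defensive copy

    double sum() {
        if (Double.isNaN(cachedSum)) {
            double total = 0;
            for (double v : values) total += v;
            cachedSum = total;
        }
        return cachedSum;
    }
}

final class Unannotated {
    private final int[] data = new int[4]; // invariable field, but a mutable array
}

final class OverrideRules {
    // A field passes if it is invariable (or promised so) and, when it is an
    // array, if the array has been promised invariable too.
    static boolean fieldPasses(Field field) {
        boolean invariable = Modifier.isFinal(field.getModifiers())
                || field.isAnnotationPresent(InvariableField.class);
        boolean arrayAccepted = !field.getType().isArray()
                || field.isAnnotationPresent(InvariableArray.class);
        return invariable && arrayAccepted;
    }

    static boolean allInstanceFieldsPass(Class<?> type) {
        for (Field field : type.getDeclaredFields()) {
            if (!Modifier.isStatic(field.getModifiers()) && !fieldPasses(field)) return false;
        }
        return true;
    }
}
```

Note that the constructor of `Samples` copies the incoming array; without that copy the promise made by `@InvariableArray` would be false, because the caller could still modify the array it passed in.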

    Generic Shallow Immutability Preassessment

    Some generic types are effectively immutable containers. Such a type is mutable if any of its generic type arguments is mutable, and immutable if all of them are immutable; but if any of them is inconclusive, then every contained instance must be assessed.

    In such cases, we must be able to issue a special preassessment which includes an object known as a deconstructor. If the generic type arguments are inconclusive, then analysis must invoke the deconstructor to enumerate the contents of the instance, so that they can be assessed.

    Deconstructors are generally trivial:

    • The deconstructor for collections simply yields all the elements of the collection. 
    • The deconstructor for maps/dictionaries simply yields all the entries/key-value pairs.
    • The deconstructor for `Lazy<T>` simply yields the one and only value contained within the lazy object.
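A sketch of what deconstructors might look like in Java. The `Deconstructor` interface and the `Deconstructors` class are hypothetical, and the instance check is a stand-in predicate. A deconstructor for `Lazy<T>` is left out, since the Java standard library has no direct equivalent of that type.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical interface: enumerates what a container instance holds.
interface Deconstructor<T> {
    Iterable<?> contentsOf(T container);
}

final class Deconstructors {
    // A collection simply yields its elements.
    static final Deconstructor<Collection<?>> COLLECTION = c -> c;

    // A map yields every key and every value.
    static final Deconstructor<Map<?, ?>> MAP = m -> {
        List<Object> parts = new ArrayList<>();
        for (Map.Entry<?, ?> entry : m.entrySet()) {
            parts.add(entry.getKey());
            parts.add(entry.getValue());
        }
        return parts;
    };

    // Stand-in for real instance assessment: only strings count as immutable.
    static final Predicate<Object> ONLY_STRINGS = o -> o instanceof String;

    // Applies an instance check to everything the deconstructor yields.
    static <T> boolean allContentsPass(Deconstructor<T> deconstructor, T container,
                                       Predicate<Object> instanceCheck) {
        for (Object element : deconstructor.contentsOf(container)) {
            if (!instanceCheck.test(element)) return false;
        }
        return true;
    }
}
```

The analysis never needs to know how a container stores its contents; it only needs the deconstructor to hand over each contained instance for assessment.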

    Elucidation

    Once we have immutability assessment working as described in the preceding sections, a new challenge becomes apparent: sometimes, a data structure that was intended to be immutable will be assessed as mutable due to some tiny mistake. If the data structure is large and complex, it might not be obvious where the mistake is. The programmer will receive a mutable assessment, but will not know why it was given and where to look to find the problem.

    For this reason, every mutable instance assessment must come with a sentence explaining to the programmer why the assessment was issued. Since every mutable instance assessment typically has one or more other assessments that are the reasons that led to it, these sentences will often form entire trees, each sentence being further explained by nested sentences.

    I call this feature elucidation.
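One possible shape for such a tree of sentences, sketched in Java. The `Elucidation` class and its `render()` method are hypothetical, and the sketch assumes Java 11 or later for `String.repeat`.

```java
import java.util.List;

// Hypothetical explanation node: one sentence, further explained by nested reasons.
final class Elucidation {
    private final String sentence;
    private final List<Elucidation> reasons;

    Elucidation(String sentence, Elucidation... reasons) {
        this.sentence = sentence;
        this.reasons = List.of(reasons);
    }

    // Renders the tree as text, one sentence per line, indented by depth.
    String render() {
        StringBuilder out = new StringBuilder();
        render(out, 0);
        return out.toString();
    }

    private void render(StringBuilder out, int depth) {
        out.append("  ".repeat(depth)).append(sentence).append('\n');
        for (Elucidation reason : reasons) reason.render(out, depth + 1);
    }
}
```

A mutable assessment could then read, for example, "instance of Order is mutable", followed on the next level by "field 'lines' holds a mutable instance", followed in turn by the reason why that instance is mutable.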



    Further Reading

    Eric Lippert's must-read post about the different kinds of immutability: Immutability in C# Part One: Kinds of Immutability

     


    Appendix

    Immutability assessment is awesome, but the more the compiler can do for us, the better.

    Here are some examples of what compilers of (non-purely functional) programming languages could be doing for us in the direction of compiler-enforced immutability:

    • A language could support an 'immutable' class modifier, which would require the class to contain only immutable members. An immutable class may not extend a mutable class, and a mutable class may not extend an immutable class. (Although a mutable class may extend a class which has not been marked as immutable, even if that class happens to be immutable.)
    • A language could support an 'immutable' generic parameter constraint, which would mandate that only immutable types can be used as generic type arguments.
    • A language could support a 'stable' field modifier, allowing a mutable field to appear in an immutable class, and acting as a promise that the field will only be mutated in a way which upholds effective immutability.
    • A language could support a 'stable array' field modifier for array fields, allowing an array to appear in an immutable class, and acting as a promise that the contents of the array will either not be mutated, or they will only be mutated in a way which upholds effective immutability.
    • etc.

     


     

    Cover image created by ChatGPT. The prompt used was: "Please give me a photographic quality image of a big diamond floating against a completely black background. Make it landscape."

     


     

    Scratch

    (Ignore)

    - As it turns out, the mutability of value types is largely irrelevant, as explained here:
    [Vladimir Sadov: "C# Tuples. Why mutable structs?"](http://mustoverride.com/tuples_structs/)

