2022-05-02

Bathyscaphe

The Bathyscaphe logo, a line drawing of bathyscaphe Trieste
by Mike Nakis, based on art found at bertrandpiccard.com

Abstract

This article introduces Bathyscaphe, an open-source Java library that you can use to assert that your objects are immutable and/or thread-safe.

The problem

Programmers all over the world are embracing immutability more and more; however, mutation is still a thing, and in all likelihood will continue being a thing for as long as there are programmers. In a world where both mutable and immutable objects exist side by side, there is often a need to ascertain that an object is of the immutable variety before proceeding to use it for certain purposes. For example, when an object is used as a key in a hash map, it had better be immutable, or else the hash code of the key may change, causing the map to severely malfunction.
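To make the hash map malfunction concrete, here is a minimal sketch. The `MutableKey` and `MutableKeyDemo` classes are hypothetical and exist only for this illustration; the map itself is the standard `java.util.HashMap`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical mutable key class, used only for this illustration.
final class MutableKey {
    int id;

    MutableKey(int id) { this.id = id; }

    @Override public boolean equals(Object other) {
        return other instanceof MutableKey && ((MutableKey) other).id == id;
    }

    @Override public int hashCode() { return Integer.hashCode(id); }
}

final class MutableKeyDemo {
    // Looks up an entry using the very same key instance that was used to store it,
    // after that key has been mutated while sitting inside the map.
    static String lookupAfterMutation() {
        Map<MutableKey, String> map = new HashMap<>();
        MutableKey key = new MutableKey(1);
        map.put(key, "value");
        key.id = 2; // the hash code of the key changes behind the map's back
        return map.get(key); // yields null: the entry is still in the map, but unreachable
    }

    public static void main(String[] args) {
        System.out.println(lookupAfterMutation());
    }
}
```

The entry has not been removed; the map still reports a size of 1, but the entry can no longer be found through its own key.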

Furthermore, when an object is not immutable, there is often the need to ascertain that it is at least thread-safe before sharing it between threads, otherwise there will be race conditions, with catastrophic results.

Note that when any of the above goes wrong, it tends to be a bug which is very difficult to troubleshoot.

Unfortunately, assessment of thread safety and immutability is not an easy task. Most don't even consider it, few talk about it, even fewer attempt it. Programmers all over the world are accustomed to routinely using objects in situations where thread-safety and/or immutability are absolute requirements, but without ever ascertaining them, essentially praying that the objects be thread-safe and/or immutable. 

As far as I can tell, in the world of the JVM there exist no libraries that will ascertain thread-safety. As for immutability, there are some that purport to do so, but judging by how marginal a status these libraries have in the greater technology landscape, they are not being put to much use. This is not surprising, because they rely exclusively on static analysis, which does not really solve the problem, as I will show.

Introducing Bathyscaphe

Bathyscaphe aims to give the Java world another chance at addressing the problem of thread-safety and immutability assessment instead of letting it linger on like a chronic ailment. Bathyscaphe is really easy to use, and produces correct and useful results. It is also very small:

  • The JAR file is only about 100 kilobytes.
  • Setting aside the test module, which necessarily depends on JUnit, Bathyscaphe does not have any dependencies outside the Java Runtime Environment. Let me repeat this: Bathyscaphe. Has. No. Dependencies. It depends on nothing. When you include Bathyscaphe in a project, you are including its tiny JAR file and nothing else.

Why existing solutions do not work

Oftentimes we can tell whether an object is mutable or immutable just by looking at its class, and indeed there exist static analysis tools that examine classes and classify them as either mutable or immutable. The widespread understanding is that once a class has been classified, all instances of that class can receive the same classification. However, in many cases it is not enough to just look at the class to determine immutability; instead, it is necessary to examine each and every instance of the class at runtime. When static analysis tools assess such classes, they yield results that are erroneous, or in the best case useless.

Examples where static analysis does not work and cannot work:

  • Static analysis does not work when a class contains a field which is final, receives its value from a constructor parameter, and the type of the field is an interface or a non-final class. Static analysis can determine that the field itself will not mutate, but has no way of knowing whether the value referenced by the field can mutate.
    • In order to err on the safe side, static analysis tools tend to assess classes containing such fields as mutable, but this is arbitrary, and it constitutes a false negative when the field is in fact initialized with an immutable value.
  • Static analysis does not work when a class is an unmodifiable collection of elements, where the elements can be of any type. The most famous examples in this category are the JDK-internal classes `java.util.ImmutableCollections.ListN` and `java.util.ImmutableCollections.List12`, instances of which are returned by `java.util.List.of()` and its overloads.
    • Some static analysis tools assess such classes as immutable, which can be a false positive, e.g. in the case of `List.of( new StringBuilder() )`.
    • Some static analysis tools assess such classes as mutable, which can be a false negative, e.g. in the case of `List.of( 1 )`.
  • Static analysis does not work when a class is freezable. By this we mean a class whose instances begin life as mutable, and are at some point instructed to transition from being mutable to being immutable. For an explanation as to why freezable classes are important, see the appendix on freezable classes.
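The unmodifiable-collection case can be observed directly with nothing but the JDK. In the following small sketch, the class name `SameClassDemo` and its method names are made up for illustration; it shows that the two lists share one runtime class even though only one of them is deeply immutable.

```java
import java.util.List;

final class SameClassDemo {
    // Both lists are created by the same factory method and have the same runtime class.
    static boolean sameRuntimeClass() {
        List<Integer> deeplyImmutable = List.of(1);
        List<StringBuilder> superficiallyImmutable = List.of(new StringBuilder());
        return deeplyImmutable.getClass() == superficiallyImmutable.getClass();
    }

    // The list itself cannot be modified, but the element it holds can.
    static String mutateElement() {
        List<StringBuilder> list = List.of(new StringBuilder("a"));
        list.get(0).append("b");
        return list.get(0).toString();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
        System.out.println(mutateElement());
    }
}
```

Any verdict that a tool attaches to the class alone is therefore wrong for at least one of these two instances.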

From the above it follows that in many cases, examining a class is not enough; in these cases, we need to examine each and every instance of the class at runtime. Furthermore, we need to examine not only the instance at hand, but the entire object graph referenced by that instance. In other words, we must not just assess shallow (superficial) immutability, we must assess deep immutability. That's what Bathyscaphe does. And that's why it is called Bathyscaphe.

How Bathyscaphe Works

Bathyscaphe does not attempt to analyze bytecode to detect whether there exist code paths that may mutate a field, or whether the paths which do in fact mutate a field only do so while taking measures to guarantee thread-safety. That kind of detective work belongs to the realm of static analysis tools, and is completely outside the scope of Bathyscaphe.

What Bathyscaphe does, in a nutshell, is to use reflection to examine each field of a class, and recursively the type of each field. If a conclusive assessment can be obtained by just looking at the class, Bathyscaphe will issue that assessment for all instances of that class. However, if the actual type of the runtime value of a certain field cannot be known by examining the declaring class, then for each instance of the declaring class Bathyscaphe will read the value of the field at runtime, obtain the actual type of the value, and recursively assess that type.
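The general idea of reflective, instance-level assessment can be sketched roughly as follows. This is not Bathyscaphe's actual implementation or API: the names `DeepImmutabilitySketch`, `isDeeplyImmutable` and `Holder` are hypothetical, the set of preassessed types is a tiny stand-in, annotations are ignored, every instance is examined at runtime (there is no class-level shortcut), and any `java.*` type that has not been preassessed is conservatively treated as mutable.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

final class DeepImmutabilitySketch {
    // A tiny stand-in for the preassessments that a real tool would need.
    private static final Set<Class<?>> KNOWN_IMMUTABLE = Set.of(
            String.class, Boolean.class, Character.class,
            Integer.class, Long.class, Double.class);

    static boolean isDeeplyImmutable(Object object) {
        return assess(object, Collections.newSetFromMap(new IdentityHashMap<>()));
    }

    private static boolean assess(Object object, Set<Object> visited) {
        if (object == null) return true;
        Class<?> type = object.getClass(); // the actual runtime type, not the declared type
        if (KNOWN_IMMUTABLE.contains(type)) return true;
        if (!visited.add(object)) return true; // already under assessment; guards against cycles
        if (type.isArray()) return false; // simplification: arrays are treated as mutable
        if (type.getName().startsWith("java.")) return false; // simplification: not preassessed
        for (Class<?> c = type; c != Object.class; c = c.getSuperclass()) {
            for (Field field : c.getDeclaredFields()) {
                if (Modifier.isStatic(field.getModifiers())) continue;
                if (!Modifier.isFinal(field.getModifiers())) return false; // the field can be reassigned
                if (field.getType().isPrimitive()) continue;
                try {
                    field.setAccessible(true);
                    // Recursively assess the value that the field refers to at this moment.
                    if (!assess(field.get(object), visited)) return false;
                } catch (ReflectiveOperationException | RuntimeException e) {
                    return false; // inaccessible field: cannot vouch for it
                }
            }
        }
        return true;
    }
}

// Hypothetical class with a final field of a non-final type, as in the first example of the previous section.
final class Holder {
    private final Object value;

    Holder(Object value) { this.value = value; }
}
```

With this sketch, `new Holder("text")` is assessed as immutable, while `new Holder(new StringBuilder())` is assessed as mutable, even though both are instances of the same class.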

Bathyscaphe is guided by annotations that mark invariable and thread-safe fields in a class. These annotations are essentially claims made by the programmer: Bathyscaphe does not, and cannot, verify the truthfulness of these claims. In this sense, Bathyscaphe does not provide a 100% fool-proof solution, because the programmer may code these annotations wrongly. In the future some synergy between Bathyscaphe and static analysis tools might be achieved, so as to provide 100% fool-proof results, but the benefit of using Bathyscaphe now lies in the fact that given correct annotations, Bathyscaphe will yield correct and usable results in all cases, whereas static analysis does not work in all cases and by its nature cannot work in all cases.

Where to find Bathyscaphe

Bathyscaphe is hosted on GitHub; see https://github.com/mikenakis/Bathyscaphe




Appendix: Goals of Bathyscaphe

I decided to write my own immutability assessment facility with the following goals in mind:

  • I want to be able to write framework-level code such as the following:
    • A hash-map which asserts that any and all keys added to it are immutable.
    • A message-passing framework which asserts that every single message that it is asked to deliver is either immutable or at the very least thread-safe.
  • I want results that are always accurate, meaning that there must be no false positives or false negatives, no compromises, no "aiming to cover the majority of use cases". All use cases should be covered, and they should be covered correctly.
  • I want to assess the immutability of objects, not classes, because I have observed that from a certain class we can sometimes construct instances that are mutable, and sometimes construct instances that are immutable. For example, both of the following method calls yield instances of the exact same class, and yet one instance is mutable, while the other instance is immutable:
    • `List.of( 1 )` (immutable)
    • `List.of( new StringBuilder() )` (mutable)
  • I want to assess the immutability of the entire graph of objects referenced by a certain object, not the immutability of that object alone. In other words, I want deep immutability assessment, as opposed to shallow or superficial immutability assessment.
  • When assessment cannot be achieved in an entirely automatic fashion (as is the case, for example, with classes that perform lazy initialization), I want to be able to achieve it by either:
    • adding special annotations to certain fields, or
    • adding a manual preassessment (assessment override) for that specific class.
  • I want the immutability assessment facility to account for freezable classes. This necessitates the introduction of a special self-assessment interface, so that instances can be asked whether they are immutable or not at any given moment.
  • When an immutability assertion fails, meaning that an object which I had intended to be immutable has been found to actually be mutable, I want to receive extensive diagnostics in human-readable form, explaining precisely why this happened.
  • I want the immutability assessment library which achieves all this to be attractive to programmers, by being:
    • very easy to integrate
    • very easy to use
    • very small
    • free of dependencies.
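As an illustration of the first goal, a key-asserting map might look roughly like this. It is a sketch under assumptions: `KeyAssertingMap` is a hypothetical class, and the `Predicate<Object>` passed to it is a stand-in for whatever deep immutability check is available; it is not Bathyscaphe's API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical wrapper around a HashMap which asserts that every key is immutable.
final class KeyAssertingMap<K, V> {
    private final Map<K, V> map = new HashMap<>();
    private final Predicate<Object> immutabilityCheck; // stand-in for a real deep assessment

    KeyAssertingMap(Predicate<Object> immutabilityCheck) {
        this.immutabilityCheck = immutabilityCheck;
    }

    V put(K key, V value) {
        // Evaluated only when assertions are enabled (java -ea); skipped otherwise.
        assert immutabilityCheck.test(key) : "key is not immutable: " + key;
        return map.put(key, value);
    }

    V get(Object key) {
        return map.get(key);
    }
}
```

Because the check sits inside an `assert` statement, it is evaluated only on runs where assertions are enabled.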

Appendix: Non-goals of Bathyscaphe

  • Predicting what code will do.
    • That is the job of static analysis tools. Bathyscaphe is meant to issue accurate and useful assessments assuming correctly annotated classes. The correctness of the annotations is a lesser and largely different problem, which is suitable as the focus of static analysis tools.
  • Dealing with untrustworthy classes.
    • Immutability can always be compromised via reflection, so trying to assess immutability in an environment which is not completely trustworthy is a hopeless endeavor. 
    • Therefore, assessment is to be done on a full-trust basis.
  • Dealing with buggy classes.
    • If a class promises, either by means of annotations or the self-assessment interface, that it will behave immutably, but in fact it does not, the fault is with that class, not with the immutability assessment facility.
  • Dealing with inaccessible classes.
    • Due to security restrictions, the inner workings of certain JDK classes are inaccessible.
    • Since every single one of those classes can receive a manual preassessment, this is not an issue.
  • Dealing with farcery.
    • If we create a subclass of a mutable class and override each mutation method to always throw an exception, do we have a mutable or immutable class in our hands? 
    • Some say it is mutable; 
    • others say it is immutable; 
    • I say it is a farce, and not worthy of consideration.
  • Performance.
    • Immutability assessment can be computationally expensive, but it is only meant to be performed through assertions, so its overhead is to be suffered only on development runs. 
    • On production runs, where assertions are supposed to be disabled, the performance penalty of using Bathyscaphe is to be zero. 
    • Therefore, performance is not an issue.
  • Non-assertive assessment.
    • Non-assertive assessment means yielding an assessment result object which can then be examined, as opposed to assertive assessment which means either passing the check or throwing an exception. 
    • Non-assertive assessment would require publicly exposing the entire assessment hierarchy of Bathyscaphe, which would then make Bathyscaphe impossible to refactor without breaking code that is already making use of it.
    • Therefore, non-assertive assessment is not a goal.
  • Static analysis.
    • While it is indeed possible in many cases to conclusively assess a class as mutable or immutable by just looking at the class, in many other cases (and certainly in all interesting cases) examining the class is not enough, as the example of `List.of( 1 )` vs. `List.of( new StringBuilder() )` demonstrates. 
    • Thus, the use of Bathyscaphe as a static analysis tool is not a goal.
    • If you need a static immutability analysis tool for Java, please see "MutabilityDetector" on GitHub: https://github.com/MutabilityDetector

Appendix: A note on reference types

If you decide to incorporate Bathyscaphe in a project, the first thing you are likely to do is what I did: introduce your own HashMap class which asserts that every key added to it is immutable. In doing so you might discover some bugs in your code, but you will also notice something seemingly strange: Bathyscaphe is preventing you from using reference types as keys, which kind of makes sense because they are in fact mutable, but you have never had any issues with that before, so why is it becoming a problem now?

What is happening is that your reference types do not override `hashCode()`, so they inherit the identity hash-code from `Object`, which remains constant throughout the lifetime of your object, despite the mutations that your object undergoes during its lifetime. So, it has been working, but it has only been working by accident.

Bathyscaphe is meant to be used precisely in order to avoid accidents, so you cannot keep doing this anymore. From now on, you will have to use `IdentityHashMap` for reference types, and `HashMap` for value types.
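The following small sketch shows both halves of this. `Counter` and `IdentityKeyDemo` are hypothetical names used only for illustration; `IdentityHashMap` is the standard JDK class.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical mutable reference type which does not override equals() or hashCode().
final class Counter {
    int count;
}

final class IdentityKeyDemo {
    // The identity hash code inherited from Object is unaffected by mutation.
    static boolean hashCodeSurvivesMutation() {
        Counter counter = new Counter();
        int before = counter.hashCode();
        counter.count++;
        return before == counter.hashCode();
    }

    // IdentityHashMap states the intent explicitly: keys are compared by reference.
    static String lookupAfterMutation() {
        Map<Counter, String> map = new IdentityHashMap<>();
        Counter counter = new Counter();
        map.put(counter, "found");
        counter.count++;
        return map.get(counter);
    }

    public static void main(String[] args) {
        System.out.println(hashCodeSurvivesMutation());
        System.out.println(lookupAfterMutation());
    }
}
```

With `IdentityHashMap` the reliance on identity is a documented property of the map rather than a side effect of a class happening not to override `hashCode()`.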

Appendix: A note on so-called immutable collections

When Java 9 was introduced, the documentation referred to the objects returned by the various overloads of the new `java.util.List.of()` method as immutable lists. Specifically, in the Java 9 API docs we read "Returns an immutable list containing one element." Later, the Java people realized that this is inaccurate, so in JDK issue 8191517 they decided among other things to "Adjust terminology to prefer 'unmodifiable' over 'immutable'." Thus, if we look at the documentation today (for example, in the Java 18 API documentation), it reads "Returns an unmodifiable list containing one element."

Dropping the word "immutable" was the right thing to do, because there is no such thing as an immutable collection, at least when type erasure is involved. That's because a collection contains elements, the immutability of which it is in no position to vouch for.

Unfortunately, the term "unmodifiable" is also problematic for describing these collections, because the term already had a meaning before `List.of()` was introduced, and the meaning was "an unmodifiable-to-you view of my collection, which is still very mutable, and any mutations I make will be visible to you." 

Luckily, `List.of()` does better than that: it returns a list that cannot be modified by anyone. So, I would rather call it "unchangeable" or "superficially immutable" to indicate that it falls short of achieving true immutability only in the sense that it cannot guarantee deep immutability.
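The difference between the two meanings of "unmodifiable" can be seen in a few lines of plain JDK code. The class name `UnmodifiableViewDemo` and its method names are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class UnmodifiableViewDemo {
    // The older meaning: a read-only view over a list that its owner can still mutate.
    static int viewSizeAfterBackingListGrows() {
        List<String> backing = new ArrayList<>(List.of("a"));
        List<String> view = Collections.unmodifiableList(backing);
        backing.add("b"); // the owner mutates the backing list
        return view.size(); // the view reflects the change: 2
    }

    // List.of(): nobody holds a mutable handle, so nobody can modify the list.
    static boolean listOfRejectsModification() {
        List<String> list = List.of("a");
        try {
            list.add("b");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(viewSizeAfterBackingListGrows());
        System.out.println(listOfRejectsModification());
    }
}
```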

Appendix: A note on assessment overrides

An assessment override on an effectively immutable class (for example, on a class which contains a lazily initialized field) is a drastic measure which should be used as seldom as possible. That's because an assessment override is also a blanket measure: it will prevent the immutability assessment facility from ascertaining the immutability of not only the lazily initialized field, but also of all other fields in the class, and in so doing it may hide errors. Assessment overrides should only be used on classes whose source code we do not control, and therefore we cannot annotate on a field-per-field basis.

Appendix: Freezable classes

As a rule, immutable objects tend to be immutable-upon-construction, meaning that any and all objects that they reference must be supplied as constructor parameters. There is, however, an exception: there is a category of objects called "freezable" which begin their life as mutable, (so that they can undergo complex initialization,) and are at some later moment instructed to transition to being immutable, that is, to "freeze". Freezing happens in-place, it is permanent from the moment it is applied, and it is trivial to implement: all it takes is to set a `frozen` field to `true`.

  • Freezing is useful for performance:
    • Creating a mutable object, initializing it, and then freezing it performs much better than creating a mutable object, initializing it, and then copying its contents into a freshly allocated immutable object.
  • Freezing can achieve things that are otherwise hard, or impossible:
    • The creation of immutable cyclic graphs requires objects to be mutable while the graph is being constructed, and become immutable in-place once construction is complete. This problem cannot be solved using the builder pattern, because the builder is bound to run into the same problem: how to construct A with a reference to B when B must be constructed with a reference to A.

To accommodate freezable classes, Bathyscaphe introduces the `ImmutabilitySelfAssessable` interface. If a class implements this interface, then Bathyscaphe will ask each instance of this class whether it is immutable or not.
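A freezable class might be sketched as follows. Since this article does not show the methods of `ImmutabilitySelfAssessable`, the sketch uses a hypothetical stand-in interface, `SelfAssessable`, with a single assumed method `isImmutable()`; `FreezableNode` and `FreezableDemo` are likewise made up. The example also builds the kind of cyclic graph that cannot be created with constructor parameters alone.

```java
// Hypothetical stand-in for a self-assessment interface; NOT Bathyscaphe's actual API.
interface SelfAssessable {
    boolean isImmutable();
}

// A node that begins life mutable and becomes immutable in place once frozen.
final class FreezableNode implements SelfAssessable {
    private final String name;
    private FreezableNode peer;
    private boolean frozen;

    FreezableNode(String name) { this.name = name; }

    void setPeer(FreezableNode peer) {
        if (frozen) throw new IllegalStateException(name + " is frozen");
        this.peer = peer;
    }

    FreezableNode peer() { return peer; }

    // Freezing is permanent: there is deliberately no way to unfreeze.
    void freeze() { frozen = true; }

    @Override public boolean isImmutable() { return frozen; }
}

final class FreezableDemo {
    // Builds a two-node cycle while the nodes are mutable, then freezes both nodes.
    static FreezableNode[] buildFrozenCycle() {
        FreezableNode a = new FreezableNode("a");
        FreezableNode b = new FreezableNode("b");
        a.setPeer(b);
        b.setPeer(a);
        a.freeze();
        b.freeze();
        return new FreezableNode[] { a, b };
    }
}
```

Note that this sketch ignores the question of how a frozen object is safely published to other threads, which a real implementation would have to consider.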
