michael.gr: Call Graph Acyclicity

Abstract

In technical design of software systems as conventionally practiced, call graphs often contain cycles. We show that cyclic call graphs are highly problematic for a number of reasons, the most important being that they require careful handling on a case-by-case basis by custom-written code, thus preventing the standardization, and therefore the automation, of system assembly. We discuss refactoring strategies for systematically eliminating call cycles, including a universally applicable technique for trivially eliminating a certain common type of call cycle. We conclude that since call cycles can be avoided or eliminated, they can be comprehensively disallowed, thus paving the way for the standardization and automation of system assembly.

(Useful pre-reading: About these papers)

What is a cycle in the call graph

When component A invokes component B, and component B also invokes component A, we say that the call graph contains a direct cycle. If A invokes B, which invokes C, which in turn invokes A, we say that the call graph contains an indirect cycle. If A invokes itself, we say that the call graph contains a self-loop, or a buckle, which is a special case of a cycle. In general, if a component diagram contains any path of invocations starting at a certain component and arriving back at the same component, the call graph contains a cycle.

As an example, let us consider the simplest possible scenario, consisting of just two components:

A temperature sensor component, whose job is to obtain a temperature value from some piece of hardware, and make that value available within the software system.
A temperature indicator component, whose job is to display a temperature on the screen.

In this diagram, each component has inputs and outputs, collectively known as pins:

An input is an endpoint for receiving incoming interface invocations; it represents an interface exposed by a component for invocation by other components. It is signified by an arrow pointing into the component.
An output is an endpoint through which a component places outgoing interface invocations; it represents an interface that a component wants to invoke. It is signified by an arrow pointing out of the component.

For those familiar with UML component diagrams, an input is a provided interface in UML, and an output is a required interface in UML. In this paper we use arrows to show the direction of invocations from output to input, instead of the socket-and-lollipop notation of UML.

Each input and output has a name and a type. Obviously, an output can be connected to an input only if their types match.

The temperature sensor:

Has an input called Reading, of type ReadonlyFloat, which can be invoked by some other component to obtain the current value of the temperature.
Has an output called Changed, of type Procedure0 (same thing as the "Runnable" of Java or the "Action" of C#) that it invokes in order to indicate that the value of the temperature has changed.

The temperature indicator:

Has an output which is called Reading, of type ReadonlyFloat, that it invokes in order to obtain the current value of the temperature.
Has an input called Refresh, of type Procedure0, which can be invoked to cause the indicator to re-display the current temperature value.

In the diagram, the Reading output of the indicator has been connected to the Reading input of the sensor, and the Changed output of the sensor has been connected to the Refresh input of the indicator.

Note that this design has a call cycle in it: The indicator invokes the sensor to obtain the current temperature, but the sensor also invokes the indicator to tell it that the current temperature has changed.

Dependencies

In michael.gr - Types of dependencies I differentiate between compile-time (static) dependencies, assembly-time (semi-dynamic) dependencies, and post-assembly-time (fully dynamic) dependencies.

Compile-time dependencies have already been given a lot of consideration and the general consensus is that they better not be cyclic. Most build systems prohibit static dependency cycles between build modules; for example, in the Java world, Maven artifacts cannot circularly depend on each other; similarly, in the dotnet world, MSBuild projects cannot circularly depend on each other. However, programming languages usually allow compile-time dependency cycles within program code: if classes A and B are defined within the same build module, it is usually possible to have A contain a hard-coded invocation to a method of B, and for B to also contain a hard-coded invocation to a method of A. Nonetheless, software architecture is not concerned with hard-coded invocations; therefore, in this paper the term "dependency" does not refer to compile-time (static) dependencies.

Assembly-time dependencies are what software architecture is mostly concerned with, and as such, this is the sense in which the term "dependency" is used in this paper.

Post-assembly-time dependencies are useful, as we will see, in certain techniques for eliminating call cycles; however, they do not affect the topology of a design, (they are implementation details which are not representable in a component diagram,) and as such these are not the kind of dependencies that we are referring to when we speak of dependencies in this paper.

Are call cycles common?

In software design as conventionally practiced, cycles in the call graph are a frequent phenomenon. Software architects often have no qualms about producing a design where A calls B and B also calls A.

In the literature we find statements endorsing this practice. For example, in the seminal paper "A Laboratory For Teaching Object-Oriented Thinking" (1989) by Kent Beck (W) and Ward Cunningham (W), the authors acknowledge that many components act as servers "with little regard or even awareness of [their] client", but also find it perfectly normal for some components to be "near-equals" in a "symmetric relation". That paper introduced the term "collaborator", which became a staple term in the software engineering discipline, specifically in order to allow for bidirectional interaction between components, as opposed to the already-existing term "dependency", which implies a one-way interaction. (For more on this, see michael.gr - Definition: Collaborator)

The problem with cyclic call graphs

Cyclic call graphs constitute tight coupling (W). This in turn has a severe negative effect on the understandability and maintainability of software. Wikipedia (W) lists some specific disadvantages of tight coupling:

A change in one module usually forces a ripple effect of changes in other modules.
Assembly of modules might require more effort and/or time due to the increased inter-module dependency.
A particular module might be harder to reuse and/or test because dependent modules must be included.

The second point might be a bit vague, but it is very important, so it is worth examining it in more depth. The term "assembly of modules" refers to the process of instantiating each component that makes up the system, and wiring the components together so that the system can start running. In this paper, we call this process system assembly.

The simplest approach to system assembly is to pass to each component all of its dependencies as constructor parameters when instantiating it, so that immediately upon construction the component is ready to start performing its duties. However, this approach is not viable if the call graph contains cycles, because circular dependencies introduce a chicken-and-egg problem: If each component requires all of its dependencies to be passed as constructor parameters, and if components A and B depend on each other, then A can only be constructed if B has already been constructed, but B cannot be constructed unless A has been constructed first.

This problem is pervasive, but it has not received much attention because we are resigned to software development being a largely unstandardized, labor-intensive process where copious amounts of custom-written code provide ad-hoc solutions to long-standing problems on a case by case basis. During system assembly, developers tend to wire as many components as they can during construction, and when a certain wire turns out to form a call cycle, they make a special case and refactor the components involved so as to postpone the wiring of that particular call until after construction. (This often leads to order-of-initialization bugs, which require painstaking effort to troubleshoot and fix.)

As the system evolves, and wires between components are added or removed, developers try to keep components unchanged by re-arranging the order in which they are instantiated, and when this is not enough, they further refactor components, turning more construction-time wiring into post-construction-time wiring, and vice versa. (Invariably resulting in more order-of-initialization bugs, and more painstaking effort to troubleshoot and fix them.)

Conventional software development practices often utilize dependency injection frameworks (W) to handle the wiring of components. Such frameworks work as if by magic, which is by some schools of thought undesirable by definition; they also represent substantial runtime overhead, so they are unsuitable for certain classes of applications, e.g. for embedded systems. Some dependency injection frameworks do not solve the problem of circular dependencies, because they simply prohibit them, whereas others attempt to solve the problem by transparently creating proxy objects, which postpone wiring until some post-construction moment, and are therefore doubly magical. The problem with proxy objects is that they tend to fail if invoked from within a constructor, and when this happens it is extremely difficult to troubleshoot and fix. Most importantly, dependency injection frameworks tend to hide dependencies from view, while the goal of software architecture is precisely the opposite: to keep dependencies into view.

The promise of authoritative technical software design, where the end-system is automatically generated from the design with no human intervention, requires us to stop writing custom code which wires components together in ad-hoc ways, and to replace it with a universally applicable, fully automated mechanism for assembling a system. In order for this mechanism to be fully automated, it must be fully standardized. If we were to try to standardize system assembly while allowing call cycles, we might for a moment imagine that we could accomplish our goal with a three-phase approach:

(Note: I am not actually recommending this! It will not work!)

Construction: All components are instantiated in an unconnected state.
Wiring: Now that all components exist, each component receives its dependencies.
Showtime: A special event is broadcast to all components, letting them know that wiring is complete, so they can now perform their initialization and start performing their duties.

This three-phase approach imposes a number of bureaucratic requirements on each component:

Each component must support some means of receiving references to its dependencies after construction. This constitutes incidental complexity.
Each component must support some means of receiving the showtime event, so that it can perform the initialization that it would have otherwise performed in its constructor. This also constitutes incidental complexity.
The requirement for components to be able to receive their dependencies after construction and to respond to the "showtime" event necessitates the introduction of some `IComponent` interface, which must be implemented by all components. This ties all components to the framework which defines `IComponent` and knows what to do with it.
The member fields in which a component stores the references to its dependencies are initialized after construction, and therefore must be declared as mutable, even though in principle they ought to be immutable. Similarly, the initialization performed during the showtime event often generates information that needs to be stored in member fields for later use. These member fields are also initialized after construction, so they must also be declared as mutable, even though in principle many of them ought to be immutable. Thus, any notion of immutability goes out the window. Many components might still be effectively immutable, but all of them are very mutable as far as any code analysis tool can tell.

Most importantly, the three-phase approach does not solve the chicken-and-egg problem that we mentioned earlier, it only postpones it in time: When an event is triggered, the order in which event handlers are invoked is undefined. This means that during the processing of the showtime event a component may attempt to invoke another component which has not yet received the event, and therefore has not yet performed its initialization.

Even if we were to further complicate things by introducing some additional mechanism that would give programmers control over the order in which components process the showtime event, the problem still remains: The presence of cycles in the call graph always leaves open the possibility that some components will be invoked before they have been initialized.

It should by now be evident that cyclic call graphs are highly problematic, and that if they are to be allowed then there is no way to standardize system assembly. It remains to be shown whether call cycles can be systematically avoided or eliminated, and thus disallowed.

Solving the trivial case

If our goal is to eliminate the cycle in the call graph of the earlier example with the temperature sensor and the temperature indicator, we can trivially accomplish it by applying the Observer Pattern (W), as shown in the following figure:

Note that the sensor does not invoke the indicator anymore; instead, the indicator invokes the sensor not only to read the current temperature but also to register an observer for temperature change notifications. The fact that the sensor will then be invoking that observer is an implementation detail which does not affect the topology of the design; thus, this design is free of cycles.

Practically, the use of the observer pattern means that the sensor cannot invoke the indicator before the indicator has registered its observable, and this in turn means that the indicator can never be invoked before its initialization is complete.

By eliminating the call cycle between the sensor and the indicator, we end up with a system that has a specific, computable order of initialization which is guaranteed to be free of problems: the sensor does not depend on the indicator anymore, so it can always be constructed first. The indicator, which depends on the sensor, can always be constructed after the sensor, so it can receive its dependencies as constructor parameters.

As a result, each component can store all of its dependencies in immutable member variables. The only interface that needs to be stored in a mutable member variable is the callback that the observable receives from the observer, and this is in line with the nature of the observer pattern, where registration and de-registration of callbacks necessarily involves mutation.

Pins of type `Observable<T>` are bound to occur so often in software designs, that they warrant some special notation in order to:

Simplify their representation.
Make them more conspicuous.
Save space in the diagram.

The following figure shows an example of what this notation could look like:

In the above diagram, the following changes have been made to pins of type `Observable<T>`:

The generics notation (`Observable<T>`) has been replaced with tilde notation (`~T`). The tilde indicates that the type of the pin is not really of type `T`, it is of type `Observable<T>`.
The arrows of the observable pins have been replaced with slightly larger circles containing slightly smaller arrows pointing in the opposite direction.

The encircled arrows are not pointing from the output to the input as normal arrows do; instead, the encircled arrows are showing the direction of callback invocations, so they are pointing from the input to the output. Essentially, a circle signifies inversion of the direction of invocations, so an encircled arrow corresponds to a normal arrow of the opposite direction.

If the callback interface does not contain any methods that return information, (as the case is with all notification interfaces,) the above design can be improved even more.

First, note that the refactoring of an input-output pair to an observer-output-observable-input pair results in components that violate the Single Responsibility Principle (SRP) (W):

Instead of simply exposing a Changed output, the sensor now has to implement the functionality of an event manager, so as to offer the same notification as an observable input.
Similarly, instead of simply exposing a Refresh input, the indicator now has to contain a few more lines of code to register a callback in order to receive the same notification as an observer.

The above may not represent a lot of work, but it is nonetheless a refactoring which must be applied to the code in order to support the needs of the design; however, in a different design, the observer pattern might be unnecessary, so why should the components be hard-coded to support it?

To solve this problem, let us revisit our first design, where we had a temperature sensor with a simple Changed output and a temperature indicator with a simple Refresh input, and hence a call cycle. Let us us now introduce two new components into the design, where one is a General-Purpose Observable and the other is a General-Purpose Observer.

The job of the observer is simply to register a callback via its Registration output during construction, and from that moment on to keep echoing each invocation of the callback to its Trigger output.

The job of the observable is to receive an observer registration via its Registration input, and to keep echoing invocations coming into its Trigger input to the registered observer, if any.

Note that with this arrangement, the call cycle is still eliminated, and at the same time we have managed to retain the sensor and indicator components in their original form, with no code refactoring necessary, since the refactoring has now been applied to the design. Also note that the SRP is being nicely upheld.

I postulate that the combination of a general-purpose observer and a general-purpose observable will be occurring quite frequently, to the point where it might be worth simplifying their representation using some special notation. For this purpose, I propose an air-gap pseudo-component. With the use of an air-gap, the previous figure turns into the following:

Note that the symbol for the air-gap component has been borrowed from electronics, where it stands for a capacitor. An electronic capacitor is also, in a sense, an air gap; however, the similarity is superficial, and it is only meant to serve as a mnemonic: In software, an air-gap pseudo-component does not maintain any charge, nor does it act as some kind of high-pass filter, etc.; it just allows us to pick a wire that takes part in a call cycle, and trivially make that wire break the cycle.

Also note that the air-gap is not a real component, it is a pseudo-component. This means that it is simply a notation, which represents an underlying pair of actual components: a general-purpose observer, and a general-purpose observable. This is necessary because these two components will invariably need to be constructed at different times during system assembly. In the sensor-indicator example, construction would take place in the following order:

The general-purpose observable is constructed first because it does not depend on anything else.
Then, the sensor is constructed, which depends on the general-purpose observable.
Then, the indicator is constructed, which depends on the sensor.
Finally, the general-purpose observer is constructed, which depends on both the indicator and the general-purpose observable.

So, during system assembly, all air-gap pseudo-components are replaced with the components that they represent, so that all components can be properly instantiated and wired in the order dictated by their dependency graph.

Finally, note that in cases where the callback interface returns information back to the caller, the general-purpose observable component (and therefore the air-gap pseudo-component) cannot be used. In these cases, the caller must incorporate the observable functionality within itself, so as to be able to cope with the situation where no observer has yet been registered and therefore no information can be returned. The general-purpose observer component can be used in all cases.

Solving non-trivial cases

The application of the observer pattern is a good first step in the direction of being able to express any software design acyclically; however, it does not cover all cases. Specifically, the observer pattern cannot be used under the following circumstances:

The observable expects information to be returned back from the observer, and it is incapable of handling the case where no observer is registered, and therefore no results can be returned. (For example, the observable needs to invoke the observer and receive information back from the observer during the observable's construction, at which point the observer cannot possibly have registered yet.)

The simple Changed notification of the temperature-sensor-and-indicator example does not fall under these circumstances because it is in the nature of notifications that they never return any information; however, in other scenarios, these circumstances can arise. Thus, it remains to be shown how call cycles can be eliminated when the observer pattern is inapplicable.

Strategy: Fusion

The presence of a call cycle between components A and B might indicate that perhaps they should not be separate components, and that we might be better off by fusing A and B into a single component. In doing so, we remove the cycle from the topology of the design by turning it into an implementation detail of the new component. The following figure illustrates this:

Clearly, the left diagram contains a cycle, whereas the right diagram does not.

Needless to say, this strategy is only marginally useful, and it should not be considered unless all else fails, because the goal of software architecture is to distribute functionality into as many components as possible, so that each component can be as simple as possible, instead of conglomerating functionality into monolithic components. The fusion strategy is mentioned here only for the sake of completeness.

Strategy: Plain Fission

The presence of a call cycle between components A and B might indicate that at least one of the two components violates the Single Responsibility Principle (SRP) (W). In this case, one of the responsibilities can be extracted into a separate component which only exposes inputs, thus breaking the cycle, as the following figure illustrates:

In this diagram, component A has been split into A1 and A2. Both A1 and B invoke A2, but A2 does not invoke anything, so there is no cycle.

This can happen, for example, if component A contains both a data model and some logic acting upon the data model. If A is incapable of fully encapsulating the data model, it may have to expose not only an output for interacting with the rest of the system, but also an input for the rest of the system to interact with the data model. With the fission refactoring, component A has been split into one component for the logic (A1) and a separate component for the data model (A2).

Strategy: Observable Fission

The astute reader might notice that in the plain fission example, components A1 and A2 are not exactly equivalent to component A: In the original diagram A used to be able to take notice of incoming calls from B intended for the data model, and could therefore take action in response to those calls, whereas in the refactored diagram A1 is oblivious to any calls that B makes to A2.

If A1 needs to be aware of such calls, this can be very easily accomplished by having A2 issue change notifications, and wiring these notifications back to A1. This new wire does not introduce a call cycle, because as we have already shown when describing the trivial case, notifications can always be air-gapped.

The following figure illustrates the application of the observable fission strategy:

Conclusion

We have shown that cyclic call graphs are highly problematic for a number of reasons, the most important being that they require careful handling on a case-by-case basis by custom-written code, thus preventing the standardization, and therefore the automation, of system assembly.

We have discussed refactoring strategies for systematically eliminating call cycles, including a universally applicable technique for trivially eliminating a certain common type of call cycle.

We conclude that since call cycles can be avoided or eliminated, they can be comprehensively disallowed, thus paving the way for the standardization and automation of system assembly.

2023-12-29

Call Graph Acyclicity