michael.gr: Towards Authoritative Technical Software Design

Abstract

This paper examines the long-standing need within the Software Engineering Discipline for technical design documents that are authoritative. A design document is authoritative if there exist technical means of materializing it as a running software system, thus guaranteeing that the end result is indeed precisely as described by the design. We notice the scarcity and inadequacy of existing solutions, we look into the difficulties involved in the creation of such documents, and we conclude with some realizations on what it would take to come up with a solution that works.

(Useful pre-reading: About these papers)

Summary

The usual means by which we describe the architecture of software systems today are whiteboard, paper, or in the best case some general-purpose box-and-arrow drawing application. These traditional means of design are distanced from the engineering entities that they are dealing with, because they lack technical means of accomplishing the following:

Informing the design with what components are available for inclusion.
Restricting the design to only valid ways of combining such components.
Materializing the design into a running software system.

These shortcomings have two serious consequences:

Designs tend to have arbitrary notation, arbitrary content, arbitrary semantics, and even arbitrary levels of abstraction, making them works of art rather than works of engineering: they are nothing but mere suggestive sketches of the vague wishes of the architects, bearing no necessary relationship to reality.

Software engineers and operations engineers do somehow manage to build workable systems out of unworkable designs, but in doing so they engage in improvisation, thus creating systems whose structure is defined not by the design document, but instead by the source code, and various scripts and configuration files scattered all over the place. This is practically equivalent to saying that knowledge of the structure of the end-system exists not in the design document, but only in the minds of some engineers.

Ideally, the technical design document should be the single authoritative source of truth for the structure of the system that it describes, but traditional software designs do not accomplish this. Thus, traditional designs are non-authoritative.

Prior Art

In long-established engineering disciplines such as mechanical, electrical, civil, etc., design work has been facilitated by Computer-Aided Engineering (CAE) tools (W) for several decades now. In electronic engineering, which is the discipline from which most parallels can be drawn to software engineering, virtually all design work since the 1980s has been done using Electronic Design Automation (EDA) / Electronic Computer-Aided Design (ECAD) tools (W). These tools have revolutionized the drafting process by using computers to develop, modify, and optimize product designs, as well as to perform simulation and validation of products before they are built.

The documents created by ECAD tools are great for communicating designs to humans: a newly hired electronic engineer begins their first day at work by studying schematic diagrams, and before the end of the day they are often able to pick up the soldering iron and start doing productive work. Contrast this with software engineering, where a new hire usually cannot be productive before spending weeks studying source code and documentation, and having numerous knowledge transfer meetings with senior engineers who already know the system.

Most importantly, ECAD tools minimize human error by helping to bridge the gap from the physical world to the design, and from the design back to the physical world. The tools have built-in knowledge of electronic components available for inclusion in a design, and electronic manufacturing has advanced to the point where a design document can be turned into a functioning circuit board with nearly zero human intervention. Thus, electronic design documents today are authoritative: the end products are accurately described by their designs.

Unfortunately, thus far, the software engineering discipline has been following a different path, where design documents are scarce, and authoritative design documents are completely non-existent. This situation has been allowed to go on for so long, largely because we already have a certain other kind of document which is authoritative, and this is the source code.

However, source code is an implementation, or at best a detailed technical description, but not a design. To say that the design of a several-million-source-code-line software system is the several million lines of source code that make up that system is equivalent to saying that the design of the Boeing 747 is a listing of the several million individual parts that make up a Boeing 747. A design is supposed to list operative components, and to show how they are interconnected, but not to delve past the level of detail of the component. Unfortunately, we do not have that in software engineering, at least not in an authoritative form.

Through the years, many Computer-Aided Software Engineering (CASE) tools (W) have been developed with the aim of aiding the software design process by making it visual rather than textual, but it would be a mistake to think that they come anywhere close to offering a solution for authoritative software design. They all fall under one of the following categories:

Integrated Development Environments (IDEs) (W). These are restricted to fancy text editors for source code, form builders for user interfaces, and perhaps a few ways of illustrating, but not defining, various aspects of a software system under development, such as class diagrams, dependency diagrams, call trees, etc. As such, they are not software design tools.
Microsoft "Visual" programming languages, e.g. Microsoft Visual C++. There is nothing visual about them; their name is just a marketing ploy. (See michael.gr - On Microsoft "Visual" products.)
Visual programming languages (W) e.g. Snap!, Scratch, EduBlocks, Blockly, etc. They do in fact produce runnable software, but they are structurally equivalent to program code, so they express implementations rather than designs. (See michael.gr - On Visual Programming Languages.)
Diagramming software (W) e.g. Visio. They are good for making fancy diagrams, but they have no inherent understanding of the meaning of the diagrams; they are not informed by the actual available software components, nor of valid ways of interconnecting them, and they cannot automatically convert diagrams into working software systems.
Modelling Languages (W) e.g. UML (W), the C4 Model (W), etc. As the name implies, they are used for modelling, but not for actually building software systems. Some tools support automatic scaffolding code generation, but this is based on the all-design-up-front fallacy: once scaffolding code has been automatically generated, and hand-written code has been added, changing the design and re-generating the scaffolding code is bound to produce a big mess, since the hand-written code does not match the scaffolding code anymore.
UML in particular aims to standardize at least the notation of diagrams, but it is very cumbersome to work with, so a large percentage of programmers have developed an acute aversion to it. It is largely considered dead. (See michael.gr - On UML.)
Architecture Description Languages (W). These tend to be modelling languages without the modesty of admitting that they are limited to nothing but modelling.
Specification Languages (W), such as Specification and Description Language (SDL) (W). These do see some use in niche applications such as process control and real-time applications, but they are suitable for describing implementations rather than designs.
Business Process Modelling (BPM) (W) tools, e.g. Business Process Model and Notation (BPMN) (W). They see some use in describing business processes within software systems, but not the software systems themselves.
Visualization tools, e.g. Lucidscale. These are invariably restricted to the visualization, exploration, and documentation of existing systems, rather than the design and deployment of new systems, or even the modification of existing systems. As such, they are reverse-engineering tools rather than design tools. Furthermore, they tend to be limited to specific realms, such as the cloud environments of particular vendors.
Component-Based Software Engineering (CBSE) (W) tools, e.g. Microsoft OLE, *nix command shell pipes, etc. These technologies tend to suffer from critical limitations such as:

Assuming the exclusive use of a particular programming language.
Assuming the exclusive use of a particular operating system.
Requiring debilitating amounts of bureaucracy to accomplish simple things.
Forcing software components to be heavyweight (sometimes as heavyweight as a process.)
Requiring communication between components to be done via a particular mechanism such as asynchronous message passing, command shell pipes, etc.
Complete absence of visual design tools, even though such tools could, in principle, be developed.

The shortcomings of these technologies are reflected in the limited extent of their adoption, and whatever meager adoption some of them do enjoy can usually be attributed to coercion rather than merit. For example, Microsoft's Object Linking and Embedding (OLE) is the only way to accomplish certain things under Windows, so people use it because they have to, not because they want to. Nobody does OLE if they can avoid it.

C4 Model (W) -- Work in progress.
Model-Driven Engineering (W) -- Work in progress.
Low Code Development Platforms (W) -- Work in progress.
Various infrastructure definition tools like Terraform (W) -- Work in progress.
ArchiMate (W) -- Work in progress.
SysML (W) -- Work in progress.
Rapid Application Development (RAD) tools (W) -- Work in progress.
Structure101 (->) -- Work in progress.
Lattix (->) -- Work in progress.
Structurizr DSL (->) -- Work in progress.
Rational Software Architect Designer (->) -- Work in progress.

It is a great paradox of our times that the Software Engineering Discipline is virtually bereft of authoritative design tools, when such tools are the bread and butter of the long-established engineering disciplines.

A more detailed look at the problem

Conventional means of software design suffer from the following shortcomings:

Designs often include elements that are not well-defined in engineering terms.

You see this with design documents containing supposedly technical but actually quite nebulous entities such as a data store here, a messaging backbone there, or a remote server over there. None of these entities is concrete enough and unambiguous enough to be suitable for inclusion in a technical design document.

Designs often include elements that are completely outside the realm of engineering.

You see this with design documents containing little human stick-figures representing users, pictures of money representing payments, etc. The presence of such items in a software design usually indicates a confusion between what is a technical design and what is a functional specification.

Designs often include elements from wrong levels of abstraction.

You see this with designs that mix software components with flowcharts, state diagrams, etc. Notwithstanding the fact that these are also boxes connected with arrows, they represent decision-making logic, which is an implementation detail of the component that contains that logic, and as such they have no place in a software design document.

You also see this with designs that confuse interfaces with other concepts, such as ownership, containment, inheritance, data flow, etc.

Designs are often expressed at an unworkably high level of abstraction.

The level of abstraction most commonly chosen by software architects is that of a block diagram, which might be suitable for abstract architectural work, but it does not contain enough detail to guarantee the feasibility of the technical design.

The level of abstraction necessary in order to guarantee feasibility is that of the component diagram, which shows connections between components on specific interfaces.

Unfortunately, since designs are distanced from the engineering entities that they are dealing with, they do not have enough factual information at their disposal to be able to delve into such a level of detail as necessary for a component diagram.

Designs are not informed with what elements are available for incorporation.

The medium on which designs are expressed usually provides no means of establishing or enforcing a correspondence between an element as it appears in the design, and the actual runnable software module that it represents. This can be okay in the case of modules that have not been developed yet, but more often than not a design intends to incorporate existing, ready-made, reusable modules. In the absence of any technical means for informing the design about existing modules, the design necessarily represents hypotheses and assumptions rather than fact.

Designs often prescribe invalid combinations of elements.

The ways in which a design intends to interconnect components do not necessarily match the ways in which the components can actually be interconnected.

A design may assume that a certain component exposes a particular interface while in fact the component does not expose such an interface.

A design may prescribe a connection between two components on a certain interface, while in fact the components cannot be connected because the type of the interface exposed by one component is not a valid match for the type of the interface required by the other component.

Designs fail to capture dynamic aspects of software systems.

Conventional means of software design often lack the ability to express dynamic constructs such as:

Plurality: Multiple instantiation of a certain module, where the number of instances is decided at runtime.

Polymorphism: Fulfilling a certain role by instantiating one of several modules capable of fulfilling that role, where the choice of which module to instantiate is made at runtime.

Designs are often incomplete.

A design may incorporate a component which needs to invoke a certain interface in order to get its job done, but omit incorporating a component which implements that interface. In such cases, the software system cannot be deployed as designed, and yet the architects are free to proclaim the design as complete.
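Several of the problems above (elements that do not correspond to real modules, connections on interfaces that do not exist or do not match, and required interfaces that nobody provides) are mechanically checkable once a design is expressed as data and compared against a catalog of available components. The following is a minimal sketch of such a check, not a definitive implementation. The names `ComponentType`, `Design`, and `validate`, as well as the representation of interface types as plain strings, are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ComponentType:
    """Hypothetical catalog entry: what a component offers and what it needs."""
    name: str
    provides: dict[str, str] = field(default_factory=dict)  # port -> interface type
    requires: dict[str, str] = field(default_factory=dict)  # port -> interface type


@dataclass
class Design:
    """Hypothetical design: named instances plus wires between their ports."""
    instances: dict[str, str]               # instance name -> component type name
    wires: list[tuple[str, str, str, str]]  # (consumer, required port, provider, provided port)


def validate(design, catalog):
    """Return a list of problems; an empty list means no problem was found."""
    problems = []
    attempted = set()  # required ports that some wire tries to satisfy

    # Every element of the design must correspond to an available component.
    for inst, type_name in design.instances.items():
        if type_name not in catalog:
            problems.append(f"{inst}: unknown component type {type_name!r}")

    # Every wire must join an existing required port to a matching provided port.
    for consumer, req_port, provider, prov_port in design.wires:
        ctype = catalog.get(design.instances.get(consumer, ""))
        ptype = catalog.get(design.instances.get(provider, ""))
        if ctype is None or ptype is None:
            problems.append(f"wire {consumer}.{req_port} -> {provider}.{prov_port}: unknown endpoint")
            continue
        needed = ctype.requires.get(req_port)
        offered = ptype.provides.get(prov_port)
        if needed is None:
            problems.append(f"{consumer} has no required port {req_port!r}")
            continue
        attempted.add((consumer, req_port))
        if offered is None:
            problems.append(f"{provider} does not expose {prov_port!r}")
        elif needed != offered:
            problems.append(f"{consumer}.{req_port} needs {needed}, but {provider}.{prov_port} is {offered}")

    # Completeness: every required port must be wired to something.
    for inst, type_name in design.instances.items():
        ctype = catalog.get(type_name)
        if ctype is None:
            continue
        for port in ctype.requires:
            if (inst, port) not in attempted:
                problems.append(f"{inst}.{port} is not connected to any provider")
    return problems
```

With a check of this kind in place, a design that omits the provider of a required interface could no longer be proclaimed complete, because the tool itself would refuse to accept it.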

The above list of problems results from the lack of technical means of informing the design with what is available and restricting it to what is possible. Correspondingly, the process of building and deploying software systems is also not informed, nor restricted, via any technical means, by the design. The end-result of all this is the following:

Software systems do not necessarily match their designs.

Even if the technical design happens to describe a software system that can actually be built as described, (which is rarely the case,) there are no technological safeguards to guarantee that it will be, so the software engineers and the operations engineers are free to build and deploy a system that bears little or no relationship to the design. Neither the architects, nor the management, have any way of knowing.

Software systems diverge from their designs over time.

Even if the technical design initially matches the deployed software system, (which, again, is almost never the case,) the system is bound to evolve. The design should ideally evolve in tandem, but it rarely does, again because there are no technological safeguards to enforce this; the engineers are free to modify and redeploy the system without updating the design document, and in fact they often do, because it saves them from double book-keeping. Thus, over time, the design bears less and less similarity to reality.

If, for the above reasons, you suspect that your technical design document does not correspond to reality, and you would like to know exactly what it is that you have actually deployed and running out there, you have to begin by asking questions of the software engineers and the operations engineers.

In order to answer your questions, they will in turn have to examine source code, version control histories, build scripts, configuration files, server provisioning scripts, and launch scripts, because the truth is scattered in all those places. In some cases they might even have to try and remember specific commands that were once typed in a terminal to bring the system to life. If this sounds a bit like it is held together by shoestrings, it is because it is indeed held together by shoestrings.

Thus, the information that you will receive will hardly be usable, and even if you manage to collect it all, make sense out of it, and update the design document with it, by the time you are done, the deployed system may have already changed, which means that the design document is already obsolete.

As a result, it is generally impossible at any given moment to know the actual technical design of a deployed software system.

This is a very sorry state of affairs for the entire software industry to be in.

Towards a solution

If we consider all the previously listed problems that plague software design as conventionally practiced, and if we look at how the corresponding problems have been solved in long-established engineering disciplines, we inescapably arrive at the following realization:

The design of a software system can only be said to accurately describe the actual running system if there exist technical means of creating the system directly from the design, with no human intervention.

In order to automatically create a software system from its design, the design must be semantically valid. This brings us to a second realization:

The semantic validity of a software design can only be guaranteed if there exist technical means of informing the design with components available for incorporation and restricting the design to only valid ways of interconnecting them.

Taken together, the above two statements define what it means for a software design to be authoritative.

Any attempt to introduce authoritative design documents in the software engineering discipline would necessarily have to borrow concepts from the electronic engineering discipline. This means that the solution must lie within the realm of Component-Based Software Engineering (CBSE), where systems consist of well-defined components, connectable via interfaces using well-defined connectivity rules.

What we need is a toolset that implements such a paradigm for software. The toolset must have knowledge of available components, knowledge of the interfaces exposed by each component, and rules specifying valid ways of connecting those interfaces. The toolset must then be capable of materializing the design into a running software system.
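To make the idea of materialization more tangible, here is a minimal sketch of how a design expressed as data might be turned into a running object graph. It assumes that components receive the interfaces they invoke as constructor parameters. The component classes, the `REGISTRY` and `DESIGN` structures, and the `materialize` function are all hypothetical names invented for this illustration.

```python
class ConsoleSink:
    """Provides a hypothetical 'sink' interface: write(text)."""
    def __init__(self):
        self.lines = []

    def write(self, text):
        self.lines.append(text)


class Greeter:
    """Requires a sink, which it receives as a constructor parameter."""
    def __init__(self, sink):
        self._sink = sink

    def greet(self, name):
        self._sink.write(f"Hello, {name}!")


# Knowledge of available components: type name -> factory.
REGISTRY = {"ConsoleSink": ConsoleSink, "Greeter": Greeter}

# The design: instance name -> (type name, {constructor parameter -> instance name}).
DESIGN = {
    "greeter": ("Greeter", {"sink": "out"}),
    "out": ("ConsoleSink", {}),
}


def materialize(design, registry):
    """Instantiate every component, building dependencies before dependents."""
    built = {}
    in_progress = set()

    def build(name):
        if name in built:
            return built[name]
        if name in in_progress:
            raise ValueError(f"cyclic dependency involving {name!r}")
        in_progress.add(name)
        type_name, wiring = design[name]
        dependencies = {param: build(target) for param, target in wiring.items()}
        built[name] = registry[type_name](**dependencies)
        in_progress.discard(name)
        return built[name]

    for name in design:
        build(name)
    return built


system = materialize(DESIGN, REGISTRY)
system["greeter"].greet("world")
print(system["out"].lines)  # ['Hello, world!']
```

Note that in this sketch the only description of which component is connected to which lives in `DESIGN`; no hand-written code does any wiring. Cyclic wiring is rejected here only because plain constructor injection cannot express it; a real toolset would need a more elaborate mechanism.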

The toolset must not repeat the mistakes and suffer from the drawbacks of previous attempts at component-based software engineering. Thus, the toolset must meet the following goals:

Facilitate any programming language.

By this we do not mean that it should be possible to freely mix C++ components with Java components; what we mean is that it should be possible to express in one place a C++ subsystem consisting of C++ components interconnected via C++ interfaces, and in another place a Java subsystem consisting of Java components interconnected via Java interfaces, and at a higher scope to have each of these subsystems represented as an individual component, where connections between the two are made via language-agnostic interfaces (e.g. REST) or cross-language interfaces (e.g. JNI, JNA, etc.).

Facilitate any scale, from embedded systems to network clouds.

This means that the nature of a component and the nature of an interface must not be restricted, so that they can be implemented differently at different levels of scale. For example, in a certain embedded system, a component might be a single C++ class exposing C++ interfaces, whereas at the network level of scale a component is likely to be a (physical or virtualized) host exposing TCP interfaces.

Guarantee type-safety at any scale.

Type safety can be carried across different levels of scale by means of parametric polymorphism (generic interfaces.) For example, the type safety of an interface between a client and a server can be facilitated with a construct like Tcp<Rest<AcmeWeather>> which stands for a TCP connection through which we are exchanging REST transactions which abide by the schema of some programmatic interface called "AcmeWeather".
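As a rough illustration of how a toolset might reason about such layered interface types, the sketch below parses the angle-bracket notation into a tree and allows a connection only if the required type and the offered type are identical at every layer. The `InterfaceType` class, the two functions, and the exact-match rule are assumptions made for the sake of the example; a real toolset would likely need richer compatibility rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterfaceType:
    """A (possibly generic) interface type, e.g. Tcp<Rest<AcmeWeather>>."""
    name: str
    args: tuple["InterfaceType", ...] = ()

    def __str__(self):
        if not self.args:
            return self.name
        return f"{self.name}<{', '.join(str(a) for a in self.args)}>"


def parse_interface_type(text):
    """Parse notation such as 'Tcp<Rest<AcmeWeather>>' into an InterfaceType tree."""
    pos = 0

    def skip_spaces():
        nonlocal pos
        while pos < len(text) and text[pos].isspace():
            pos += 1

    def parse_one():
        nonlocal pos
        skip_spaces()
        start = pos
        while pos < len(text) and (text[pos].isalnum() or text[pos] == "_"):
            pos += 1
        if pos == start:
            raise ValueError(f"expected a type name at position {pos} in {text!r}")
        name = text[start:pos]
        args = []
        skip_spaces()
        if pos < len(text) and text[pos] == "<":
            pos += 1
            args.append(parse_one())
            skip_spaces()
            while pos < len(text) and text[pos] == ",":
                pos += 1
                args.append(parse_one())
                skip_spaces()
            if pos >= len(text) or text[pos] != ">":
                raise ValueError(f"expected '>' at position {pos} in {text!r}")
            pos += 1
        return InterfaceType(name, tuple(args))

    result = parse_one()
    skip_spaces()
    if pos != len(text):
        raise ValueError(f"unexpected text at position {pos} in {text!r}")
    return result


def can_connect(required, offered):
    """Simplest possible rule: the two types must match at every layer."""
    return parse_interface_type(required) == parse_interface_type(offered)
```

Under this rule, a client requiring `Tcp<Rest<AcmeWeather>>` can be connected to a server offering exactly that, but not to one offering `Tcp<Rest<AcmeTraffic>>`, even though both are TCP connections carrying REST.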

Require no extra baggage.

Components should not be required to include a lot of extra overhead to facilitate their inclusion in a design. Especially at the embedded level, components should ideally include near-zero overhead.

This means that a C++ class which accepts, as constructor parameters, the interfaces that it invokes, and which exposes interfaces for invocation by virtue of simply implementing them, should be usable in a design as it is, or at most with a very thin wrapper.

Most extra functionality necessary for representing the component within a design should be provided by a separate companion module, which acts as a plugin to the design toolset, and exists only during design-time and deployment-time, but not during run-time.
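The text above speaks of C++, but the same separation can be sketched in Python for brevity. In the sketch below, `TemperaturePanel` is a plain class that knows nothing about any design toolset, while `describe` stands in for the companion module, deriving the list of required interfaces from the constructor signature. All names are hypothetical, and deriving interfaces by reflection is just one conceivable approach; in a language without reflection, the companion would have to state them explicitly.

```python
import inspect
from typing import Protocol


class Thermometer(Protocol):
    def read_celsius(self) -> float: ...


class Display(Protocol):
    def show(self, text: str) -> None: ...


class TemperaturePanel:
    """A plain class: no base class, no registration, no toolset imports."""
    def __init__(self, thermometer: Thermometer, display: Display):
        self._thermometer = thermometer
        self._display = display

    def refresh(self) -> None:
        self._display.show(f"{self._thermometer.read_celsius():.1f} C")


def describe(component_class):
    """Companion-side helper: map each constructor parameter to the interface it requires.

    In practice this would live in a separate module that is loaded by the
    design toolset and is absent from the running system.
    """
    signature = inspect.signature(component_class)
    return {
        name: parameter.annotation.__name__
        for name, parameter in signature.parameters.items()
        if parameter.annotation is not inspect.Parameter.empty
    }


print(describe(TemperaturePanel))  # {'thermometer': 'Thermometer', 'display': 'Display'}
```

The point of the arrangement is that `TemperaturePanel` carries no extra baggage at runtime: everything the toolset needs to know about it is computed or declared elsewhere.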

Facilitate incremental adoption.

It should be possible to create an authoritative design document to express the design of a small subsystem within a larger system whose design has not (yet) been based on an authoritative design document.

In systems of medium scale and above, this may be handled by making the core engine of the toolset available on demand, during runtime, to quickly instantiate and wire together a small subsystem within a larger system.

In embedded-scale systems, it should be possible to utilize code generation to avoid having the core engine present at runtime.

Utilize a text-based document format.

In software we make heavy use of version control systems, which work best with text files, so the design documents must be text-based. The text format would essentially be a system description language, so it must be programmer-friendly in order to facilitate editing using a text editor or an IDE. A graphical design tool would read text of this language into data structures, allow the visual editing of such data structures, and save them back as text.
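The read-edit-save cycle described above can be sketched with a deliberately tiny, line-based format. The syntax shown here (`component name : Type` and `connect a.port -> b.port`) is purely hypothetical and is not a proposal for the actual language; the sketch only shows that a design can round-trip between text and data structures.

```python
from dataclasses import dataclass, field


@dataclass
class Design:
    components: dict[str, str] = field(default_factory=dict)          # instance name -> type name
    connections: list[tuple[str, str]] = field(default_factory=list)  # (source port, target port)


def parse_design(text):
    """Read the hypothetical text format into a Design."""
    design = Design()
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # ignore comments and blank lines
        if not line:
            continue
        words = line.split()
        if len(words) == 4 and words[0] == "component" and words[2] == ":":
            design.components[words[1]] = words[3]
        elif len(words) == 4 and words[0] == "connect" and words[2] == "->":
            design.connections.append((words[1], words[3]))
        else:
            raise ValueError(f"line {number}: cannot parse {raw!r}")
    return design


def format_design(design):
    """Write a Design back out in the same text format."""
    lines = [f"component {name} : {type_name}" for name, type_name in design.components.items()]
    lines += [f"connect {source} -> {target}" for source, target in design.connections]
    return "\n".join(lines) + "\n"


SOURCE = """\
# A two-component design.
component sensor : TemperatureSensor
component logger : ConsoleLogger
connect sensor.readings -> logger.input
"""

design = parse_design(SOURCE)
assert parse_design(format_design(design)) == design  # the round trip preserves the design
```

This sketch discards comments when saving. A real graphical tool would have to preserve comments and formatting, otherwise every visual edit would produce noisy diffs in version control.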

Facilitate Hierarchical System Composition.

Some systems are so complex that expressing them in a single design may be inconvenient to the point of being unworkable. To solve this issue, container components are necessary, which encapsulate entire separately-editable designs, and expose some of the interfaces of the nested design as interfaces of their own. Thus, containers can be used to abstract away entire sub-designs into atomic, opaque black-boxes within greater designs.
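A container of this kind might be modelled as sketched below: it is itself a component, its implementation is a set of child components, and its own ports are nothing but selected ports of its children under new names. The class names and the `exports` mapping are assumptions made for illustration, and the wiring between children is omitted for brevity.

```python
class Component:
    """A component as seen from the outside: a name and two sets of ports."""
    def __init__(self, name, provides=(), requires=()):
        self.name = name
        self.provides = set(provides)
        self.requires = set(requires)


class Container(Component):
    """A component whose implementation is a nested, separately editable design."""
    def __init__(self, name, children, exports):
        # exports maps an outer port name to (child name, child port name).
        self.children = {child.name: child for child in children}
        self.exports = dict(exports)
        provides, requires = set(), set()
        for outer, (child_name, port) in self.exports.items():
            child = self.children[child_name]
            if port in child.provides:
                provides.add(outer)
            elif port in child.requires:
                requires.add(outer)
            else:
                raise ValueError(f"{child_name} has no port {port!r}")
        super().__init__(name, provides, requires)


# A nested design with two children; only the cache's 'get' port is visible outside.
storage = Container(
    "storage",
    children=[
        Component("cache", provides={"get"}, requires={"backend"}),
        Component("database", provides={"query"}),
    ],
    exports={"lookup": ("cache", "get")},
)

# From the outside, 'storage' is just another component, so it nests freely.
application = Container(
    "application",
    children=[storage, Component("ui", requires={"data"})],
    exports={},
)
```

Because a container is indistinguishable from any other component when viewed from the enclosing design, nesting can continue to any depth.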

Facilitate dynamic software systems.

Every non-trivial system has the ability to vary, at runtime, the number of instances of some components in response to changing computation needs, and to choose to instantiate different types of components to handle different needs. Therefore, a toolset aiming to be capable of expressing any kind of design must be capable of expressing, at the very minimum, the following dynamic constructs:

Plurality: Multiple instantiation of a certain component, where the number of instances is decided at runtime.

Polymorphism: Fulfilling a certain role by instantiating one of several different types of components capable of fulfilling that role, where the choice of which component type to instantiate is made at runtime.

Polymorphic plurality: A combination of the previous two: A runtime-variable array of components where each component can be of a different, runtime-decidable type.
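One conceivable way of expressing these three constructs is for the design to declare, for each role, which component types may fill it and whether more than one instance is allowed, leaving the actual choices to runtime. The sketch below illustrates this; the `Slot` and `RuntimeSlot` classes and the example component types are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """Design-time declaration of a role that is filled at runtime."""
    role: str
    candidates: tuple[str, ...]  # component types allowed to fill the role
    many: bool = False           # True: the number of instances is decided at runtime


class RuntimeSlot:
    """Runtime counterpart of a Slot; it enforces what the design declared."""
    def __init__(self, slot, registry):
        self.slot = slot
        self._registry = registry
        self.instances = []

    def instantiate(self, type_name):
        if type_name not in self.slot.candidates:
            raise ValueError(f"{type_name} cannot fill role {self.slot.role!r}")
        if not self.slot.many and self.instances:
            raise ValueError(f"role {self.slot.role!r} accepts a single instance")
        instance = self._registry[type_name]()
        self.instances.append(instance)
        return instance


class JpegCodec: ...
class PngCodec: ...
class Worker: ...

REGISTRY = {"JpegCodec": JpegCodec, "PngCodec": PngCodec, "Worker": Worker}

plurality = Slot("worker", ("Worker",), many=True)                         # N instances of one type
polymorphism = Slot("codec", ("JpegCodec", "PngCodec"))                    # one instance, type chosen at runtime
polymorphic_plurality = Slot("codec", ("JpegCodec", "PngCodec"), many=True)  # N instances of varying types

codecs = RuntimeSlot(polymorphic_plurality, REGISTRY)
codecs.instantiate("JpegCodec")
codecs.instantiate("PngCodec")
print([type(c).__name__ for c in codecs.instances])  # ['JpegCodec', 'PngCodec']
```

The design remains authoritative even though the instance count and the concrete types are unknown at design time, because the running system can only do what the declared slots permit.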

Facilitate multiple alternative configurations.

In virtually every software development endeavor there is a core system design which is materialized in a number of variations to cover different needs. For example, a debug build vs. a release build; a testing build vs. a production build; a build with or without instrumentation; a build with hardware emulation vs. a build targeting the actual hardware; etc. Therefore, the toolset must facilitate the expression of alternative configurations so that each configuration can be defined authoritatively.
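One simple way to picture this is a core design plus a set of named configurations, each of which overrides the component type of some instances. The sketch below assumes exactly that model; the instance names, type names, and the `resolve` function are invented for illustration, and a real toolset might need configurations to alter wiring as well.

```python
# The core design: instance name -> component type name.
BASE = {
    "sensor": "HardwareSensor",
    "logger": "QuietLogger",
    "controller": "Controller",
}

# Each configuration lists only what differs from the core design.
CONFIGURATIONS = {
    "production": {},
    "debug": {"logger": "VerboseLogger"},
    "emulated": {"sensor": "EmulatedSensor", "logger": "VerboseLogger"},
}


def resolve(base, configurations, name):
    """Return the full design for one configuration, without modifying the inputs."""
    overrides = configurations[name]
    unknown = set(overrides) - set(base)
    if unknown:
        raise ValueError(f"configuration {name!r} overrides unknown instances: {sorted(unknown)}")
    return {**base, **overrides}


print(resolve(BASE, CONFIGURATIONS, "emulated"))
# {'sensor': 'EmulatedSensor', 'logger': 'VerboseLogger', 'controller': 'Controller'}
```

Since every configuration is derived from the same core design by explicit, recorded overrides, each resulting variant is described just as authoritatively as the core design itself.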

Be accessible and attractive.

The extent to which, and the speed with which, a new software development technology is adopted greatly depends on how accessible and attractive the technology is. To this end:

The core toolset must be free and open source software. (Profit may be made from additional, optional tools, such as a visual editor.) This also means that the toolset must be cross-platform, executable software rather than a cloud offering.

A clear distance must be kept from unattractive technologies like UML, XML, etc.

The literature around the toolset must avoid alienating terms such as "enterprise software", "standards committee", "industry specifications consortium", etc.

Note: this post supersedes michael.gr - The Deployable Design Document.

Scratch (please ignore)

How to Become a Great Software Architect • Eberhard Wolff • GOTO 2019

Ivory tower architecture, detachment from reality, no impact

Relationship between architect and self-organizing agile team

Hopelessness of enforcing the architecture

Architects who also code vs. who do not code

Software Architecture vs Code • Simon Brown • GOTO 2014

UML is dead

Multitude of examples of preposterous approaches to architecture diagrams

Mismatch between coding and architecture

Need for a language that has components as a first-class thing

Definition of component: a bunch of related stuff with a nice clean interface (something you can substitute out)

The C4 model (System Context, Containers, Components, Classes)

"I am working on a tool to [shoot?] through the code base, extract components or services or microservices, and then automatically draw the picture based on relationships and dependencies".

2023-12-23
