2023-12-23

Towards Authoritative Technical Software Design


Abstract

In this paper we examine the long-standing need within the Software Engineering Discipline for technical design documents that are authoritative. A design document is authoritative if there exist technical means of materializing it as a running software system, thus guaranteeing that the end result is indeed precisely as described by the design. We notice the scarcity and inadequacy of existing solutions, we look into the difficulties involved in the creation of such documents, and we conclude with some realizations on what it would take to come up with a solution that works.

(Useful pre-reading: About these papers)

Summary

The usual means by which we describe the architecture of software systems today are whiteboard, paper, or in the best case some general-purpose box-and-arrow drawing application. These means of design are distanced from the engineering entities that they are dealing with, because they lack technical means of accomplishing the following:

  • Informing the design with what components are available for inclusion.
  • Restricting the design to only valid combinations of such components.
  • Materializing the design into a running software system.

As a result, designs made using such means are non-authoritative. They tend to have arbitrary notation, arbitrary content, arbitrary semantics, and even arbitrary levels of abstraction, making them works of art rather than works of engineering: they are nothing but mere suggestive sketches of the vague wishes of the architects, bearing no necessary relationship to reality.

Software engineers and operations engineers do somehow manage to build workable systems out of unworkable designs, but in doing so they engage in improvisation, thus creating systems whose structure is defined not by the design document, but instead by the source code, and various scripts and configuration files scattered all over the place. This is practically equivalent to saying that knowledge of the structure of the end-system only exists in the minds of some engineers.

Ideally, the technical design document should be the single authoritative source of truth for the structure of the system that it describes, but we are not there yet.

Prior Art

In long-established engineering disciplines such as mechanical, electrical, civil, etc., design work has been facilitated by Computer-Aided Engineering (CAE) tools (W) for several decades now. In electronic engineering, which is the discipline from which most parallels can be drawn to software engineering, virtually all design work since the 1980s has been done using Electronic Design Automation (EDA) / Electronic Computer-Aided Design (ECAD) tools (W). These tools have revolutionized the drafting process by using computers to develop, modify, and optimize product designs, as well as to perform simulation and validation of products before they are built.

The documents created by ECAD tools are great for communicating designs to humans: a newly hired electronic engineer begins their first day at work by studying schematic diagrams, and before the end of the working day they may be able to pick up the soldering iron and start doing productive work. Contrast this with software engineering, where a new hire usually cannot be productive before spending weeks studying source code and documentation, and having numerous knowledge transfer meetings with senior engineers who already know the system.

Most importantly, ECAD tools minimize human error by helping to bridge the gap from the physical world to the design, and from the design back to the physical world. The tools have built-in knowledge of electronic components available for inclusion in a design, and electronic manufacturing has advanced to the point where a design document can be turned into a functioning circuit board with practically zero human intervention. Thus, electronic design documents today are authoritative: the end products are accurately described by their designs.

Unfortunately, thus far, in the software engineering discipline we have been following a different path, where design documents are scarce, and authoritative design documents are completely non-existent. This situation has been allowed to go on for so long, largely because we already have a certain other kind of document which is authoritative, and this is the source code.

However, source code is an implementation, or at best a detailed technical description, but certainly not a design. To say that the design of a two-million-line-of-source-code software system is the two million lines of source code that make up that system is equivalent to saying that the design of the Great Wall of China is the list of every single brick that makes up the Great Wall of China. A design is supposed to list components, and to show how they are combined, but not to delve past the level of detail of the component, and certainly not down to the level of the code. Unfortunately, we do not have that in software engineering, certainly not in any form which is authoritative.

Through the years, many Computer-Aided Software Engineering (CASE) tools (W) have been developed, but it would be a mistake to imagine even for a moment that they come anywhere close to offering a solution for authoritative software design:

  • A plethora of Integrated Development Environments (IDEs) (W) have been created, but they are restricted to fancy text editors for source code, form builders for user interfaces, and perhaps a few ways of illustrating, but not defining, various aspects of a software system under development, such as class diagrams, dependency diagrams, call trees, etc.
  • In the Microsoft world there is Visio, which was never developed past the stage of yet another non-authoritative design tool, and products with deceitful titles alluding to design, (e.g. Microsoft Visual C++,) which do not actually deliver any means of design, let alone authoritative design. (See michael.gr - On Microsoft "Visual" products.)
  • Several Visual Programming Languages (W) have been created, such as Snap!, Scratch, EduBlocks, Blockly, etc., which do in fact produce runnable software, but they are structurally equivalent to program code, so they express implementations rather than designs. (See michael.gr - On Visual Programming Languages.)
  • A multitude of Specification Languages (W), Modelling Languages (W), and Architecture Description Languages (W) have been created, most of which are virtually unknown outside of the -usually academic- circles that invented them. Of those that are somewhat known, most see little practical application in the real world. A few that might be noteworthy are:
    • The Unified Modelling Language (UML) (W) aims to standardize at least the notation of diagrams, but it is very cumbersome to work with, so most people would rather write code. Furthermore, there exist no tools that actually do something useful with UML diagrams, so its application is limited to modelling, as its name implies; it cannot be used for authoritative design. (See michael.gr - On UML.)
    • The Specification and Description Language (SDL) (W) is seeing some use in process control and real-time applications. It is suitable for describing implementations rather than designs.
    • The Business Process Modeling Notation (BPMN) (W) is seeing some use in describing business processes within software systems, but not the software systems themselves.
  • Various reverse-engineering tools have been developed, such as Lucidscale (lucidscale.com), which are invariably restricted to the visualization, exploration, and documentation of existing systems, rather than the design and deployment of new systems, or even the modification of existing systems. Furthermore, they tend to be limited to specific realms, such as the cloud environments of particular vendors.
  • A plethora of technologies have been implemented under the umbrella term Component-Based Software Engineering (CBSE) (W), for which visual design tools could, in principle, be built, but such tools are still virtually non-existent. Furthermore, the technologies tend to suffer from critical limitations such as:
    • Assuming the exclusive use of a particular programming language.
    • Assuming the exclusive use of a particular operating system.
    • Requiring debilitating amounts of bureaucracy to accomplish simple things.
    • Forcing software components to be heavyweight (sometimes as heavyweight as a process.)
    • Requiring communication between components to be done via a particular mechanism such as asynchronous message passing, command shell pipes, etc.
    The shortcomings of these technologies are reflected in the limited extent of their adoption, and whatever meager adoption some of them do enjoy can usually be attributed to coercion rather than merit. For example, Microsoft's Object Linking and Embedding (OLE) was aggressively promoted by Microsoft and declared to be the one and only way to accomplish various things under Windows, so people use it because they have to, not because they want to. Nobody does OLE if they can avoid it.
  • C4 Model (W) -- Work in progress.
  • Model-Driven Engineering (W) -- Work in progress.
  • Low Code Development Platforms (W) -- Work in progress.
  • Various infrastructure definition tools like Terraform (W) -- Work in progress.
  • ArchiMate (W) -- Work in progress.
  • SysML (W) -- Work in progress.
  • Rapid Application Development (RAD) tools (W) -- Work in progress.

It is a great paradox of our times that the Software Engineering Discipline is virtually bereft of authoritative design tools, when such tools are the bread and butter of the long-established engineering disciplines.

A more detailed look at the problem

Conventional means of software design suffer from the following shortcomings:

  • Designs often include elements that are not well-defined in engineering terms.
    You see this with design documents containing supposedly technical but actually quite nebulous entities such as a data store here, a messaging backbone there, or a remote server over there. None of these entities is concrete enough and unambiguous enough to be suitable for inclusion in a technical design document.
  • Designs often include elements that are completely outside the realm of engineering.
    You see this with design documents containing little human stick-figures representing users, pictures of money representing payments, etc. The presence of such items in a software design usually indicates a confusion between what is a technical design and what is a functional specification.
  • Designs often include elements from wrong levels of abstraction.
    You see this with designs that mix software components with flowcharts, state diagrams, etc. Notwithstanding the fact that these are also boxes connected with arrows, they represent decision-making logic, which is an implementation detail of the component that contains that logic, and as such they have no place in an architecture document.
    You also see this with designs that confuse interfaces with other concepts, such as ownership, containment, inheritance, data flow, etc.
  • Designs are often expressed at an unworkably high level of abstraction.
    The level of abstraction most commonly chosen by software architects is that of a block diagram, which might be suitable for abstract architectural work, but it does not contain enough detail to guarantee the feasibility of the technical design.
    The level of abstraction necessary in order to guarantee feasibility is that of the component diagram, which shows connections between components on specific interfaces.
    Unfortunately, since designs are distanced from the engineering entities that they are dealing with, they do not have enough factual information at their disposal to be able to delve into such a level of detail as necessary for a component diagram.
  • Designs are not informed with what elements are available for incorporation.
    The medium on which designs are expressed usually provides no means of establishing or enforcing a correspondence between an element as it appears in the design, and the actual runnable software module that it represents. This can be okay in the case of modules that have not been developed yet, but more often than not a design intends to incorporate existing, ready-made, reusable modules. In the absence of any technical means for informing the design about existing modules, the design necessarily represents hypotheses and assumptions rather than fact.
  • Designs often prescribe invalid combinations of elements.
    The ways in which a design intends to interconnect components do not necessarily match the ways in which the components can actually be interconnected.
    • A design may assume that a certain component exposes a particular interface while in fact the component does not expose such an interface.
    • A design may prescribe a connection between two components on a certain interface, while in fact the components cannot be connected because the type of the interface exposed by one component is not a valid match for the type of the interface required by the other component.
  • Designs fail to capture dynamic aspects of software systems.
    Conventional means of software design often lack the ability to express dynamic constructs such as:
    • Plurality: Multiple instantiation of a certain module, where the number of instances is decided at runtime.
    • Polymorphism: Fulfilling a certain role by instantiating one of several modules capable of fulfilling that role, where the choice of which module to instantiate is made at runtime.
  • Designs are often incomplete.
    A design may incorporate a component which needs to invoke a certain interface in order to get its job done, but omit incorporating a component which implements that interface.
    In such cases, the software system cannot be deployed as designed, and yet the architects are free to proclaim the design as complete.

The above list of problems results from the lack of technical means of informing the design with what is available and restricting it to what is possible. Correspondingly, the process of building and deploying software systems is also not informed, nor restricted, via any technical means, by the design. The end-result of all this is the following:

  • Software systems do not necessarily match their designs.
    Even if the technical design happens to describe a software system that can actually be built as described, (which is rarely the case,) there are no technological safeguards to guarantee that it will, so the software engineers and the operations engineers are free to build and deploy a system that bears little or no relationship to the design. Neither the architects, nor the management, have any way of knowing.
  • Software systems diverge from their designs over time.
    Even if the technical design initially matches the deployed software system, (which, again, is almost never the case,) the system is bound to evolve. The design should ideally evolve in tandem, but it rarely does, again because there are no technological safeguards to enforce this; the engineers are free to modify and redeploy the system without updating the design document, and in fact they often do, because it saves them from double book-keeping. Thus, over time, the design bears less and less similarity to reality.

    If, due to the above reasons, you suspect that your technical design document does not correspond to reality, and you would like to know exactly what it is that you have actually deployed and running out there, you have to begin by asking questions to the software engineers and the operations engineers.

    In order to answer your questions, they will in turn have to examine source code, version control histories, build scripts, configuration files, server provisioning scripts, and launch scripts, because the truth is scattered in all those places. In some cases they might even have to try and remember commands that were once issued to bring the system to life. If this sounds a bit like it is held together by shoestrings, it is because it is indeed held together by shoestrings.

    Thus, the information that you will receive will hardly be usable, and even if you manage to collect it all, make sense out of it, and update the design document with it, by the time you are done, the deployed system may have already changed.

    As a result, it is generally impossible at any given moment to know the actual technical design of a deployed software system.

    This is a very sorry state of affairs for the entire software industry to be in.

Towards a solution

    If we consider all the previously listed problems that plague software design as conventionally practiced, and if we look at how the corresponding problems have been solved in long-established engineering disciplines, we inescapably arrive at the following realization:

    The design of a software system can only be said to accurately describe the actual system as built and deployed if there exist technical means of creating the system directly from the design, with no human intervention.

    In order to automatically create a software system from its design, the design must be semantically valid. This brings us to a second realization:

    The semantic validity of a software design can only be guaranteed if there exist technical means of informing the design with components available for incorporation and restricting the design to only valid ways of combining them.

    Together, the above two statements define what it means for a software design to be authoritative.

    Any attempt to introduce authoritative design documents in the software engineering discipline would necessarily have to borrow concepts from the electronic engineering discipline. This means that the solution must lie within the realm of Component-Based Software Engineering (CBSE), where systems consist of well-defined components, connectable via interfaces using well-defined connectivity rules.

    What we need is a toolset that implements such a paradigm for software. The toolset must have knowledge of available components, knowledge of the interfaces exposed by each component, and rules specifying valid ways of connecting those interfaces. The toolset must then be capable of materializing the design into a running software system.
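As a rough illustration of the kind of checking such a toolset might perform, here is a minimal Python sketch that validates a design against a catalog of available components. Everything in it is hypothetical and invented for this example (the `Component` and `Connection` types, the `validate` function, and the component and interface names); it assumes, for simplicity, at most one instance of each component and plain string interface types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """A catalog entry: what a component exposes and what it needs."""
    name: str
    provides: frozenset  # interface types the component exposes
    requires: frozenset  # interface types the component must be given


@dataclass(frozen=True)
class Connection:
    """A wire in the design: `consumer` invokes `interface` on `provider`."""
    consumer: str
    provider: str
    interface: str


def validate(catalog, instances, connections):
    """Return a list of problems; an empty list means the design is valid."""
    problems = []
    # Informing the design: every element must exist in the catalog.
    for name in instances:
        if name not in catalog:
            problems.append(f"unknown component: {name}")
    known = [n for n in instances if n in catalog]
    # Restricting the design: every connection must match real interfaces.
    for c in connections:
        if c.consumer not in known or c.provider not in known:
            problems.append(f"connection refers to a component not in the design: {c}")
            continue
        if c.interface not in catalog[c.provider].provides:
            problems.append(f"{c.provider} does not expose {c.interface}")
        if c.interface not in catalog[c.consumer].requires:
            problems.append(f"{c.consumer} does not require {c.interface}")
    # Completeness: every required interface must be wired to a provider.
    for name in known:
        wired = {c.interface for c in connections if c.consumer == name}
        for missing in sorted(catalog[name].requires - wired):
            problems.append(f"{name} requires {missing} but nothing provides it")
    return problems


catalog = {
    "WeatherUi": Component("WeatherUi", frozenset(), frozenset({"WeatherService"})),
    "AcmeClient": Component("AcmeClient", frozenset({"WeatherService"}), frozenset()),
}
wires = [Connection("WeatherUi", "AcmeClient", "WeatherService")]
print(validate(catalog, ["WeatherUi", "AcmeClient"], wires))
```

Only a design for which the list of problems is empty would be handed over to whatever mechanism materializes it into a running system.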

    The toolset must not repeat the mistakes and suffer from the drawbacks of previous attempts at component-based software engineering. Thus, the toolset must meet the following goals:

    • Facilitate any programming language.

    By this we do not mean that it should be possible to freely mix C++ components with Java components; what we mean is that it should be possible to express in one place a C++ subsystem consisting of C++ components interconnected via C++ interfaces, and in another place a Java subsystem consisting of Java components interconnected via Java interfaces, and at a higher scope to have each of these subsystems represented as an individual component, where connections between the two are made via language-agnostic interfaces (e.g. REST) or cross-language interfaces (e.g. JNI, JNA, etc.)

    • Facilitate any scale, from embedded systems to network clouds.

    This means that the nature of a component and the nature of an interface must not be restricted, so that they can receive different definitions at different levels of scale. For example, in a certain system of embedded scale, a component might be a single C++ class exposing C++ interfaces, whereas at the network level of scale a component is likely to be a (physical or virtualized) host exposing TCP interfaces.

    • Guarantee type-safety at any scale.

    Type safety can be carried across different levels of scale by means of parametric polymorphism (generic interfaces.) For example, the type safety of an interface between a client and a server can be facilitated with a construct like Tcp<Rest<AcmeWeather>> which stands for a TCP connection through which we are exchanging REST transactions that abide by the schema of some programmatic interface called "AcmeWeather".
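To make the idea of generic interface types a little more concrete, the following Python sketch represents a type such as Tcp<Rest<AcmeWeather>> as a nested data structure that a design tool could compare when deciding whether two endpoints may be connected. The `InterfaceType` class, the `parse_type` and `can_connect` functions, and the exact-match rule are all assumptions made for this example, and "AcmeTraffic" is an invented name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterfaceType:
    """A possibly generic interface type such as Tcp<Rest<AcmeWeather>>."""
    name: str
    args: tuple = ()

    def __str__(self):
        if not self.args:
            return self.name
        return f"{self.name}<{', '.join(str(a) for a in self.args)}>"


def parse_type(text):
    """Parse a type expression like 'Tcp<Rest<AcmeWeather>>'."""
    parsed, rest = _parse(text.replace(" ", ""))
    if rest:
        raise ValueError(f"unexpected trailing text: {rest!r}")
    return parsed


def _parse(text):
    # Read the identifier up to the next delimiter.
    end = 0
    while end < len(text) and text[end] not in "<>,":
        end += 1
    name, rest = text[:end], text[end:]
    if not name:
        raise ValueError("expected a type name")
    args = []
    if rest.startswith("<"):
        rest = rest[1:]
        while True:
            # Each type argument is itself a (possibly generic) type.
            arg, rest = _parse(rest)
            args.append(arg)
            if rest.startswith(","):
                rest = rest[1:]
            elif rest.startswith(">"):
                rest = rest[1:]
                break
            else:
                raise ValueError("expected ',' or '>'")
    return InterfaceType(name, tuple(args)), rest


def can_connect(required, provided):
    """Endpoints match only if their full types are identical, at every level."""
    return required == provided


client_side = parse_type("Tcp<Rest<AcmeWeather>>")
print(can_connect(client_side, parse_type("Tcp<Rest<AcmeWeather>>")))
print(can_connect(client_side, parse_type("Tcp<Rest<AcmeTraffic>>")))
```

Because the comparison covers the whole nested type, two endpoints that agree on TCP and on REST but disagree on the schema are still rejected.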

    • Require no extra baggage.

    Components should not be required to include a lot of extra functionality to facilitate their inclusion in a design. Especially at the embedded level, components should ideally include zero bureaucracy.

    This means that a C++ class which accepts the interfaces it invokes as constructor parameters, and which exposes interfaces for invocation simply by implementing them, should be usable in a design as it is, or at most with a very thin wrapper.

    Most or all extra functionality necessary for representing the component within a design should be provided by a separate companion module, which acts as a plugin to the design toolset, and exists only during design-time and deployment-time.
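The text above speaks of C++ classes; the sketch below uses Python only for brevity, and all names in it (`Thermostat`, `TemperatureSensor`, `Heater`, `Controller`, `describe`) are invented for illustration. It shows a component that carries no toolset-specific baggage, next to a separate companion-side helper that works out what the component requires and provides. In Python this can be done by introspection; a C++ companion module would presumably have to declare the same information explicitly.

```python
import abc
import inspect
import typing


class TemperatureSensor(abc.ABC):
    @abc.abstractmethod
    def read_celsius(self) -> float: ...


class Heater(abc.ABC):
    @abc.abstractmethod
    def set_on(self, on: bool) -> None: ...


class Controller(abc.ABC):
    @abc.abstractmethod
    def step(self) -> None: ...


class Thermostat(Controller):
    """A plain component: nothing here knows about any design toolset."""

    def __init__(self, sensor: TemperatureSensor, heater: Heater):
        # Interfaces to invoke arrive as constructor parameters.
        self._sensor = sensor
        self._heater = heater

    def step(self) -> None:
        # The interface exposed for invocation is simply implemented.
        self._heater.set_on(self._sensor.read_celsius() < 20.0)


def describe(component_class):
    """Companion-side helper: derive a component's interfaces by introspection."""
    hints = typing.get_type_hints(component_class.__init__)
    hints.pop("return", None)
    requires = {param: t.__name__ for param, t in hints.items()}
    # Abstract base classes in the ancestry are taken to be exposed interfaces.
    provides = [base.__name__ for base in component_class.__mro__[1:]
                if inspect.isabstract(base)]
    return {"name": component_class.__name__, "requires": requires, "provides": provides}


print(describe(Thermostat))
```

The `describe` function lives outside the component, so it needs to exist only at design time and deployment time, which is the role the companion module plays in the text.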

    • Facilitate incremental adoption.

    It should be possible to create an authoritative design document to express the design of a small subsystem within a larger system whose design has not (yet) been based on an authoritative design document.

    In systems of medium scale and above, this may be handled by making the core engine of the toolset available on demand, during runtime, to quickly instantiate and wire together a small subsystem within a larger system.

    In embedded-scale systems, it should be possible to utilize code generation to avoid having the core engine present at runtime.

    • Utilize a text-based document format.

    In software we make heavy use of version control systems, which work best with text files, so the design documents must be text-based. The text format would essentially be a system description language, so it must be programmer-friendly in order to facilitate editing using a text editor or an IDE. A graphical design tool would read text of this language into data structures, allow the visual editing of such data structures, and save them back as text.
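To give a feel for the read, edit, and save-back cycle described above, here is a small Python sketch built around a made-up, line-oriented design format. The format itself (`component name: Type` and `connect a.port -> b.port` lines, with `#` comments) is purely an assumption for this example and not a proposal for the actual language; the `Design`, `loads` and `dumps` names are likewise hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Design:
    components: dict = field(default_factory=dict)   # instance name -> component type
    connections: list = field(default_factory=list)  # (consumer port, provider port)


def loads(text):
    """Read the hypothetical line-oriented format into a Design."""
    design = Design()
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        words = line.split()
        if words[0] == "component" and len(words) == 3 and words[1].endswith(":"):
            design.components[words[1][:-1]] = words[2]
        elif words[0] == "connect" and len(words) == 4 and words[2] == "->":
            design.connections.append((words[1], words[3]))
        else:
            raise ValueError(f"line {number}: cannot parse {raw!r}")
    return design


def dumps(design):
    """Write a Design back as text, in a stable order that diffs well."""
    lines = [f"component {name}: {kind}"
             for name, kind in sorted(design.components.items())]
    lines += [f"connect {a} -> {b}" for a, b in sorted(design.connections)]
    return "\n".join(lines) + "\n"


source = """\
component ui: WeatherUi   # front end
component client: AcmeClient

connect ui.weather -> client.service
"""
design = loads(source)
design.components["cache"] = "MemoryCache"  # stands in for a visual edit
print(dumps(design))
```

Writing the output in a stable, sorted order is a deliberate choice here: it keeps version control diffs small when a graphical tool saves a design that a person previously edited by hand.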

    • Facilitate Hierarchical System Composition.

    Some systems are so complex that expressing them in a single design may be inconvenient to the point of being unworkable. To solve this issue, container components are necessary, which encapsulate entire separately-editable designs, and expose some of the interfaces of the nested design as interfaces of their own. Thus, containers can be used to abstract away entire sub-designs into atomic, opaque black-boxes within greater designs.
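One possible way to model such containers is sketched below in Python. The `Leaf` and `Container` types, the `exports` mapping, and the `resolve` function are assumptions made for illustration, as are the component and port names. A container re-exports selected ports of its children under names of its own, and from the outside it offers nothing but those ports.

```python
from dataclasses import dataclass, field


@dataclass
class Leaf:
    """An atomic component with named interface ports."""
    name: str
    ports: frozenset


@dataclass
class Container:
    """A component that wraps a nested design and re-exports some inner ports."""
    name: str
    children: dict = field(default_factory=dict)  # child name -> Leaf or Container
    exports: dict = field(default_factory=dict)   # outer port -> (child name, child port)

    @property
    def ports(self):
        # From the outside, a container looks like any other component.
        return frozenset(self.exports)


def resolve(component, port):
    """Follow exports downwards until the leaf that really owns `port` is found."""
    path = [component.name]
    while isinstance(component, Container):
        if port not in component.exports:
            raise KeyError(f"{component.name} does not export {port}")
        child_name, port = component.exports[port]
        component = component.children[child_name]
        path.append(component.name)
    if port not in component.ports:
        raise KeyError(f"{component.name} has no port {port}")
    return "/".join(path), port


db = Leaf("db", frozenset({"sql"}))
api = Leaf("api", frozenset({"http", "admin"}))
backend = Container("backend", {"db": db, "api": api}, {"web": ("api", "http")})
site = Container("site", {"backend": backend}, {"public": ("backend", "web")})
print(resolve(site, "public"))
```

In this sketch the `sql` and `admin` ports are not exported, so nothing in the enclosing design can connect to them, which is what makes the container an opaque black box.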

    • Facilitate dynamic software systems.

    Virtually every non-trivial system has the ability to vary, at runtime, the number of instances of some components in response to changing computation needs, and to choose to instantiate different types of components to handle different needs. Therefore, a toolset aiming to be capable of expressing any kind of design must be capable of expressing, at the very minimum, the following dynamic constructs:

      • Plurality: Multiple instantiation of a certain component, where the number of instances is decided at runtime.
      • Polymorphism: Fulfilling a certain role by instantiating one of several components capable of fulfilling that role, where the choice of which component to instantiate is made at runtime.
      • Polymorphic plurality: A combination of the previous two: A runtime-variable array of components where each component can be of a different, runtime-decidable type.
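The three constructs above could be captured by something like the following Python sketch, in which a design-time "slot" declares the role to be fulfilled and the component types allowed to fulfil it, while the number and types of the instances are decided at runtime. The `Slot` class and the encoder components are hypothetical names invented for this example.

```python
import abc


class Encoder(abc.ABC):
    """The role: anything that can encode a payload."""

    @abc.abstractmethod
    def encode(self, payload: bytes) -> bytes: ...


class PlainEncoder(Encoder):
    def encode(self, payload):
        return payload


class HexEncoder(Encoder):
    def encode(self, payload):
        return payload.hex().encode("ascii")


class Slot:
    """A design-time slot filled at runtime with zero or more instances.

    `candidates` lists the component types that the design allows in this
    slot; anything that cannot fulfil the role is rejected up front.
    """

    def __init__(self, role, candidates):
        for candidate in candidates.values():
            if not issubclass(candidate, role):
                raise TypeError(f"{candidate.__name__} cannot fulfil {role.__name__}")
        self._candidates = dict(candidates)
        self.instances = []

    def add(self, kind):
        """Polymorphism: the type to instantiate is chosen at runtime, by name."""
        instance = self._candidates[kind]()
        self.instances.append(instance)
        return instance

    def resize(self, count, kind):
        """Plurality: grow or shrink to `count` instances at runtime."""
        while len(self.instances) < count:
            self.add(kind)
        del self.instances[count:]


slot = Slot(Encoder, {"plain": PlainEncoder, "hex": HexEncoder})
slot.resize(2, "plain")  # plurality
slot.add("hex")          # polymorphic plurality: a mixed array of instances
print([type(i).__name__ for i in slot.instances])
```

The point of the sketch is that the runtime remains free to vary what is instantiated, but only within the bounds that the design has declared.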

    • Facilitate multiple alternative configurations.

    In virtually every software development endeavor there is a core system design which is materialized in a number of variations to cover different needs. For example, a debug build vs. a release build; a testing build vs. a production build; a build with or without instrumentation; a build with hardware emulation vs. a build targeting the actual hardware; etc. Therefore, the toolset must facilitate the expression of alternative configurations so that each configuration can be defined authoritatively.
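A simple way to picture this is a core design plus a set of named overlays, each of which rebinds some of the roles of the core design. The Python sketch below assumes exactly that model; the role names, the component names, and the `materialize` function are all invented for illustration.

```python
# The core design: which component fulfils each role by default.
BASE = {
    "clock": "SystemClock",
    "motor": "HardwareMotorDriver",
    "log": "NullLogger",
}

# Each configuration lists only what differs from the core design.
CONFIGURATIONS = {
    "production": {},
    "debug": {"log": "ConsoleLogger"},
    "emulated": {"motor": "EmulatedMotorDriver", "log": "ConsoleLogger"},
}


def materialize(base, configurations, name):
    """Return the full role -> component binding for one named configuration."""
    overrides = configurations[name]
    unknown = sorted(set(overrides) - set(base))
    if unknown:
        # A configuration may rebind roles of the core design, not invent new ones.
        raise ValueError(f"configuration {name!r} overrides unknown roles: {unknown}")
    # Build a new mapping so that the core design itself is never modified.
    return {**base, **overrides}


for name in CONFIGURATIONS:
    print(name, materialize(BASE, CONFIGURATIONS, name))
```

Since every configuration is derived from the same core design by an explicit, checked set of differences, each resulting variant is fully defined by the document rather than by ad-hoc build flags.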

    • Be accessible and attractive.

    The extent and speed by which a new software development technology is adopted greatly depends on how accessible and attractive the technology is. To this end:

      • The core toolset must be free and open source software. (Profit may be made from additional, optional tools, such as a visual editor.) This also means that the toolset must be cross-platform, executable software rather than a cloud offering.
      • A clear distance must be kept from unattractive technologies like UML, XML, etc.
      • The literature around the toolset must avoid alienating terms such as "enterprise software", "standards committee", "industry specifications consortium", etc.


      Note: this post supersedes michael.gr - The Deployable Design Document.

