michael.gr: software engineering


2021-01-16

The MVVM architectural design pattern

Here is a brief technical explanation of MVVM, which contains enough detail (borrowed from its WPF implementation) and examples to allow the reader to grasp how it actually works.

Object Lifetime Awareness

The Thinker (French: Le Penseur) by Auguste Rodin (From Wikipedia)

Abstract

Garbage collectors have given us a false sense of security with respect to what happens to an object once we stop thinking about it. The assumption is that it will be magically taken care of, but this does not always go as hoped, resulting in memory leaks and bugs due to failure to perform necessary cleanup. Tools for troubleshooting such problems are scarce, and not particularly helpful, so finding and fixing such problems is notoriously difficult.

A methodology is presented, which differs from current widespread practices, for maintaining awareness of, and exercising full deterministic control over, the lifetime of certain objects in a garbage-collected environment. We issue hard errors in the event of misuse, and accurate diagnostic messages in the event of omissions, thus improving the robustness of software and lessening the troubleshooting burden.

Coherence: The Assertable Lock

Abstract

A Software Design Pattern for concurrent systems which makes race conditions something that can be asserted against and thus deterministically eliminated rather than stochastically reduced or minimized. (Subject, of course, to the thoroughness of the assertions.)

Image by reginasphotos from pixabay.com

A description of the problem

Every Software Engineer who has dealt with concurrency knows that it is hard. The bane of concurrency is race conditions: when a thread accesses data without taking into account the fact that the data is shared with other concurrently running threads which may alter that data at any unforeseeable moment in time.
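To make the idea of asserting against race conditions concrete, here is a minimal sketch. It is not the Coherence implementation described in the post, only one possible reading of the abstract. The class `GuardedCounter` and its methods are hypothetical; `ReentrantLock.isHeldByCurrentThread()` is a standard JDK method. Every access to the shared data asserts that the calling thread holds the lock, so an unprotected access fails immediately (when assertions are enabled with `-ea`) instead of occasionally corrupting data.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical example class: shared state whose every access asserts that the lock is held.
final class GuardedCounter
{
	private final ReentrantLock lock = new ReentrantLock();
	private int value;

	// Runs the given action while holding the lock.
	void runLocked( Runnable action )
	{
		lock.lock();
		try
		{
			action.run();
		}
		finally
		{
			lock.unlock();
		}
	}

	void increment()
	{
		assert lock.isHeldByCurrentThread() : "increment() called without holding the lock";
		value++;
	}

	int value()
	{
		assert lock.isHeldByCurrentThread() : "value() called without holding the lock";
		return value;
	}
}
```

Calling `increment()` outside of `runLocked()` trips the assertion on every run, not just on the unlucky ones, which is what makes the mistake testable.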

The famous "Could not load file or assembly or one of its dependencies" error message

If you have ever done any software development under Microsoft Windows you have probably come across this famous error message: "System.IO.FileNotFoundException : Could not load file or assembly 'Acme.dll' or one of its dependencies. The specified module could not be found."

Modern software makes heavy use of dynamic link libraries, and the problem with this kind of library is that for various reasons it might not be there when you need it, resulting in runtime errors. This is the runtime error you get under Windows when this happens.

Naturally, when you see this message, the first thing to do is to check whether Acme.dll is there, and what you usually discover is that the file is indeed there. When dealing with computers, most error messages that you come across tend to leave some room for troubleshooting, but when the system is reporting that a certain file does not exist on your very own filesystem, while the file is most certainly there, the situation seems really hopeless. You are stymied.

Domain Oriented Programming

A Software Design Pattern which brings the principles of Inheritance, Encapsulation and Polymorphism one level up from the Class level to the Subsystem level, and offers a way of realizing relationships between classes so as to achieve dependency inversion by means of propagation instead of injection.

Part 1: Dependency Inversion

The software that we write often invokes other software to get parts of the job done. These are known as Services or Dependencies. If Class A is making use of some Class B, then Class A depends on Class B, so Class B is a dependency of Class A.

The principle of Dependency Inversion says that a class should not contain any direct calls to specific instances of any of its dependencies. Instead, it should receive these instances as parameters during initialization.

That's all very nice, but passing dependencies around can become quite a complicated business, and in large systems it can become a nightmare.
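As a minimal sketch of the conventional injection approach (the names `MessageSink`, `Greeter`, and `DependencyInversionDemo` are hypothetical, invented for this illustration): the class receives its dependency as a constructor parameter and never creates or looks it up by itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical dependency: anything that can receive a message.
interface MessageSink
{
	void accept( String message );
}

// Hypothetical "Class A": it depends on MessageSink, but receives the instance during initialization.
final class Greeter
{
	private final MessageSink sink;

	Greeter( MessageSink sink )
	{
		this.sink = sink;
	}

	void greet( String name )
	{
		sink.accept( "Hello, " + name );
	}
}

final class DependencyInversionDemo
{
	// Wires a Greeter to a list-backed sink and returns whatever the sink received.
	static List<String> run()
	{
		List<String> messages = new ArrayList<>();
		Greeter greeter = new Greeter( messages::add );
		greeter.greet( "world" );
		return messages;
	}
}
```

Even in this tiny example, somebody has to construct the sink and hand it over; with dozens of classes and dependencies, that wiring is the part that gets complicated.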

On Validation vs. Error Checking

Let me start with a couple of pedantic definitions; stay with me, the beef follows right afterwards.

Conventional wisdom says that validation is different from error checking.

Validation is performed at the boundaries of a system, to check the validity of incoming data, which is at all times presumed to be potentially invalid. When invalid data is detected, validation is supposed to reject it. Validation is supposed to be always on: you cannot switch it off on release builds and only have it enabled on debug builds.
Error checking, on the other hand, is performed inside a system, checking against conditions that should never occur, to keep making sure that everything is working as intended. In the event that an error is encountered, the intent is to signal a catastrophic failure. Essentially, the term Error Checking is shorthand for Internal Error Checking. It can be implemented using assertions, thus being active on the debug build only, and having a net cost of zero on the release build.
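The conventional distinction can be sketched as follows. The class `Percentage` and its methods are hypothetical. In Java, the closest equivalent of a "debug build" is running with assertions enabled (`-ea`); they are disabled by default.

```java
// Hypothetical example class contrasting the two kinds of checks.
final class Percentage
{
	// Validation: runs at the boundary of the system, is always on, and rejects invalid input.
	static int parse( String text )
	{
		int value;
		try
		{
			value = Integer.parseInt( text.trim() );
		}
		catch( NumberFormatException e )
		{
			throw new IllegalArgumentException( "not a number: " + text, e );
		}
		if( value < 0 || value > 100 )
			throw new IllegalArgumentException( "out of range: " + value );
		return value;
	}

	// Internal error checking: the value is presumed to have been validated already,
	// so a violation signals a bug. The check costs nothing when assertions are disabled.
	static int complement( int percentage )
	{
		assert percentage >= 0 && percentage <= 100;
		return 100 - percentage;
	}
}
```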

So far so good, right?

Index of notable GitHub projects

Intertwine (C#, Java)

A framework for automatically converting method invocations of any programmatic interface into a single-method normal form and converting back to invocations of the original interface.
https://blog.michael.gr/2022/12/intertwine.html
For C#: https://github.com/mikenakis/IntertwineCSharp
For Java: https://github.com/mikenakis/Public/tree/master/intertwine

VsDebugLogger (C#)

Speeds up Visual Studio debug output by orders of magnitude.
https://github.com/mikenakis/VsDebugLogger

bathyscaphe (Java)

Deep immutability assessment (and coming soon: thread-safety assessment) for Java objects.
https://blog.michael.gr/2022/05/bathyscaphe.html

testana (Java)

A command-line utility for running only those tests that actually need to run.
https://blog.michael.gr/2018/04/github-project-mikenakis-testana.html

classdump (Java)

A command-line utility for dumping the contents of class files.
https://blog.michael.gr/2018/04/github-project-classdump.html

bytecode (Java)

A lightweight framework for manipulating JVM bytecode.
https://blog.michael.gr/2018/04/github-project-bytecode.html

2019-12-01

The case for software testing

What to reply to a non-programmer who thinks that testing is unnecessary or secondary

At some point during his or her career, a programmer might come across the following argument, presented by some colleague, partner, or decision maker:

Since we can always test our software by hand, we do not need to implement Automated Software Testing.

Apparently, I reached that point in my career, so now I need to debate this argument. I decided to be a good internet citizen and publish my thoughts. So, in this post I am going to be deconstructing that argument, and demolishing it from every angle that it can be examined. I will be doing so using language that is easy to process by people from outside of our discipline.

In the particular company where that argument was brought forth, there exist mitigating factors which are specific to the product, the customers, and the type of relationship we have with them, all of which make the argument not as unreasonable as it may sound when taken out of context. Even in light of these factors, the argument still deserves to be blown out of the water, but I will not be bothering the reader with the specific situation of this company, so as to ensure that the discussion is applicable to software development in general.

In its more complete form, the argument may go like this:

Medium.com: Psychology of Code Readability by Egon Elbre

This is an article I enjoyed reading. I am in full agreement with every claim made therein. It was very nice to see certain conclusions that I have arrived at in the past being spelled out and illustrated with nice explanations.

https://medium.com/@egonelbre/psychology-of-code-readability-d23b1ff1258a

2018-05-22

Confucius on naming

There is a Chinese proverb which states:

The beginning of wisdom is to call things by their proper name.

This proverb is generally understood to be a summarization and paraphrase of an actual quote from the "Rectification of Names" section of the Analects of Confucius. (See Wikipedia - Rectification of names)

GitHub project: mikenakis-rumination

NOTE:

This project has been retired. The GitHub link does not even work anymore.

This page only serves historical documentation purposes.

Making plain old java objects aware of their own mutations.

The mikenakis-rumination logo.
Based on original from free-illustrations.gatag.net
Used under CC BY License.

GitHub project: mikenakis-testana (Java)

GitHub project: mikenakis-classdump

A command-line utility for dumping the contents of class files.

The mikenakis-classdump logo.
Based on an image found on the interwebz.

GitHub project: mikenakis-bytecode

A lightweight framework for manipulating JVM bytecode.

The mikenakis-bytecode Logo, an old-fashioned coffee grinder.
by Mike Nakis, based on original work by Gregory Sujkowski from the Noun Project.
Used under CC BY License.

My Very Own™ Coding Style

Foreword

In my career I have experimented a lot with coding styles, mostly on pet projects at home, but also in workplaces where each developer was free to code in whatever way they pleased, or in workplaces where I was the only developer.

My experimentation has been in the direction of achieving maximum objective clarity and readability, disregarding convention, custom, precedent, and the shock factor: the fact that a particular style element might be alien to others plays very little role in my evaluation of the objective merits of the element.

The counter-argument (the argument in favor of following convention) says that whatever benefits might be offered by a coding style cannot possibly outweigh the benefit of presenting others with a familiar coding style. This is of course true, and that's why it makes sense for an organization to choose a traditional coding style. However, I am not a company; I am an individual, and my own projects are mine. Furthermore, my counter-counter-argument is that I firmly believe that tradition is a synonym for progress stopper.

So, over the years I have tried many things, once even radically changing my coding style in the middle of a project. (Modern IDEs make it very easy to do so.) Some of the things I tried I later abandoned, others I permanently adopted.

So, my coding style today is the result of all this experimentation. If it looks strange to you, keep in mind that every single aspect of it has been deliberately chosen to be this way by someone who was not always coding like that, and who one day decided to start coding like that in the firm belief that this way is objectively better.

In moving on with each of these changes over the years, I had to overcome my own subjective distaste of the unfamiliar, for the benefit of what I considered to be objectively better. So, if you decide to judge my coding style, please first ask yourself to what extent you are willing to overcome the same.

My Very Own™ Coding Style

I use this coding style for languages that belong to the C syntax family, for example C, C++, Java, and C#. These are languages with curly braces, a reduced set of keywords, and a moderate amount of parentheses. I hardly ever program in any other language, but when I do, I apply whatever parts of this coding style are applicable.

Tabs vs. Spaces: Tabs

I use tabs for indentation, because this allows different developers to view the code with the amount of indentation that they are accustomed to, without having to reformat the code.

Spaces should never be used for indentation.
Tabs should never be used for anything other than indentation.

Tabular formatting: No

Tabular formatting refers to inserting spaces within statements in consecutive lines of code to align parts of the statements into columns across those lines of code. So, for example, in statements that are of the form `variable-type variable-name = initializer-expression;` spaces would be inserted after the variable-types to align all the variable-names in a column, and more spaces would be inserted after the variable-names to align all the equals-signs in a column.
I used to be a big fan of this; however:

Generics make this less appealing, because most type definitions might be short, but one generic type definition might be very long, resulting in lots of seemingly unnecessary whitespace.
A change in one line of code may result in re-alignment of many lines around it, and diff tools are not smart enough to account for this, so the possibility of merge conflicts skyrockets.

Thus, at some point my verdict became to drop tabular formatting.
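For illustration, here is the same set of declarations written both ways (the class and variable names are hypothetical). Notice how the single long generic type dictates the amount of padding on every other line of the tabular version.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TabularFormattingDemo
{
	// Tabular formatting: names and equals-signs are padded with spaces to line up in columns.
	static int tabular()
	{
		int                        count  = 3;
		String                     name   = "abc";
		Map<String, List<Integer>> lookup = new HashMap<>();
		return count + name.length() + lookup.size();
	}

	// No tabular formatting: each line stands on its own, so editing one line never touches its neighbors.
	static int plain()
	{
		int count = 3;
		String name = "abc";
		Map<String, List<Integer>> lookup = new HashMap<>();
		return count + name.length() + lookup.size();
	}
}
```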

Spaces:

Before or after unary operators: Never
Around binary operators: Always
Around ternary operators: Always
Before a comma: Never
After a comma: Always
Before opening parenthesis of function argument list: Never
Before opening parenthesis of flow-control keyword: Never
Inside parenthesized expressions: Never
Around parameter lists: Always

This means that a function call must look like this: `foo( a, b );` Note that there is a space after `(` and a space before `)`.
This applies not only to function calls, but also to function declarations and to keywords that accept parameters.
Parameterless functions can still be coded like this: `foo();` because the rule is carefully worded to call for spaces around parameter lists, not spaces inside parentheses. When you are invoking a parameterless function there is no parameter list, therefore no spaces.
Note that although parameter lists require spaces, parenthesized expressions require no spaces, and therein lies the advantage of this pair of rules: it suddenly becomes clear which parenthesis belongs to a function call, and which parenthesis belongs to an expression. For example, passing an expression as a parameter to a function looks like this: `foo( a, (b + c) );`
Note that certain C# constructs like `typeof()` and `nameof()` are expressions, not functions, therefore their arguments must not be padded with spaces.
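A short Java sketch of the spacing rules applied together (the class and function names are hypothetical):

```java
final class SpacingDemo
{
	static int scale( int value, int factor )
	{
		return value * factor;
	}

	// Parameterless: there is no parameter list, therefore no spaces.
	static int answer()
	{
		return 42;
	}

	// Parameter lists are padded with spaces; the parenthesized expression (a + b) is not.
	static int compute( int a, int b, int c )
	{
		if( a < 0 )
			return -a;
		return scale( (a + b) * c, 2 ) + answer();
	}
}
```

In the last line, the padded parentheses belong to the call to `scale` and the unpadded ones belong to the expression, so the two can be told apart at a glance.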

Right Margin Column: 120

In May of 2020 Linus Torvalds declared that the number of characters per line in the Code Style of the Linux Kernel was to be increased from 80 characters to 100 characters. That's laughable. We have had widescreen monitors since the beginning of the century. We can easily do 120 characters. I sometimes do 160 characters.

Hard right margin: No

The right margin is not meant to be a hard limit: if a line needs to be longer, make it longer. It is fine to push uninteresting stuff off the screen horizontally in order to fit more interesting stuff inside the screen vertically.

New line after attributes (C#) / annotations (Java): Never

This may push the function definition quite far to the right, and that's fine.
If a function has lots and lots of attributes/annotations, it might look very ugly, but that's okay, because it happens very rarely, and when it does happen, maybe that is exactly how it should be: beautiful things should look beautiful, and ugly things should look ugly.

Empty lines before a block-style comment: One

If a block-style comment appears in code, there must always be a blank line before it.
This way, we are clearly indicating that the comment refers to the following line.

Empty lines within functions: Zero

Quite often programmers like to use blank lines to visually separate pieces of code that are conceptually different. The problem is, the blank line gives no hint about the concepts involved, so it is entirely useless to anyone but the person who inserted it. If it is worth leaving a blank line, then it is worth adding a block-style comment explaining why, in which case a blank line before the comment is also necessary due to a previous rule. Better yet, move the conceptually different code into a different function, and give that function a descriptive name, so that you need neither comment nor blank line.

Empty lines before a function: One

The rule which requires a blank line before a block comment covers all the cases where a function is preceded by a block comment that describes the function. However, quite often functions have descriptive names, rendering explanatory comments unnecessary. For these cases, we mandate separating functions with a blank line.

Empty lines between fields: Zero

If you really need a blank line between two fields, you must insert a block comment.

Empty lines anywhere else: Zero

Some people have the habit of leaving one or more blank lines in various odd places according to some ad-hoc rules that exist only in their head. The problem is, it is impossible to teach such rules to an automatic reformatting tool. Therefore, there shall be no such rules. There should never be any spurious blank lines anywhere.

Curly braces: Allman

See http://en.wikipedia.org/wiki/Indent_style#Allman_style
Each opening brace and each closing brace is on a separate line, the braces are at the same indentation level as the controlling statement, and the code in the block is one indentation level deeper.

Luckily, this is the curly brace style of C#.
Unluckily, this is not the curly brace style of Java.

I do not care; this is my coding style even when I code in Java. The Egyptian curly brace style which is so popular in the Java world is absolutely retarded.
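For reference, this is what Allman style looks like in Java (the class and method are hypothetical):

```java
final class AllmanDemo
{
	// Returns the integer average of the even values, or zero if there are none.
	static int averageOfEvens( int[] values )
	{
		int sum = 0;
		int count = 0;
		for( int value : values )
		{
			if( value % 2 != 0 )
				continue;
			sum += value;
			count++;
		}
		if( count == 0 )
			return 0;
		return sum / count;
	}
}
```

Every opening brace sits directly above its closing brace, at the indentation level of the statement that controls the block.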

Braces on single statement blocks: Never (unless the language requires them)

The "always" choice seems to be very popular; that's retarded.
The "sometimes" choice also seems to be popular, but I strive for consistency.
Note that in some languages some keywords have been introduced that require curly braces even if the controlled block consists of a single statement, for example the try-catch clause in C++ and the try-catch-finally clause in Java and C#. I greatly resent this.

Nesting: Always consistent*

Some people like writing quick one liners, for example `if( x ) return 0;` all in one line. That's unacceptable.
Some people refrain from nesting the `case` labels in a `switch` statement, or if they do, then they refrain from nesting the code under the `case` labels. That's unacceptable.
In C#, people quite often refrain from nesting the classes within their namespaces. That's unacceptable.
In C#, people quite often do not nest cascaded `using` statements. That's unacceptable.
The only case where I sometimes violate this rule, and I am not yet completely decided on how to go about it, is with single statement functions in Java, which I sometimes code in one line, not because I believe this is correct, but because I am expressing a wish that Java would offer a functional style of declaring functions the way C# does.
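A small Java sketch of consistent nesting in a `switch` statement (the class and method are hypothetical): the `case` labels are nested under the `switch`, and the code is nested under the `case` labels.

```java
final class NestingDemo
{
	static String describe( int legs )
	{
		switch( legs )
		{
			case 2:
				return "biped";
			case 4:
				return "quadruped";
			default:
				return "other";
		}
	}
}
```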

Type identifier casing: SentenceCase

Even if the type is private.

Constant identifier casing: SentenceCase

Even if the constant is private.

Private member identifier casing: camelCase

A very popular choice is prefixing the identifier with `_` or `m_`; that's unacceptable.

Public member identifier casing:

C#: SentenceCase
Java: camelCase

The camelCase choice of Java is retarded, but it would be too heretical even for me to go against it, mainly because there exist tools that use reflection to guess what methods are getters and setters, and everything goes haywire if the capitalization is not what these tools expect.

Static fields: Same as other fields

Note: this explicitly means that static fields must not be named differently from other fields. Some people like doing weird things like prefixing static fields with `s_`. That's not only mighty ugly, but also entirely unnecessary, because any half-decent IDE will color-code static fields for you.

Acronyms: SentenceCase

In other words, never use "GUID"; always use "Guid". The acronym becomes a word, so that it can be added to the spell-checker.
Speaking of spell checkers:

The spell-checker must always be on
Every commit must pass inspection by the spell-checker
The spell-checker wordlist must be committed like any other file
The spell-checker wordlist must pass code review like anything else.

That's how the quality of the codebase can be protected despite contributions from people with poor command of the English language.

Explicit `this`: Never

Unless a field is receiving its value from a method or constructor parameter, in which case the parameter must have the exact same name as the field, and subsequently `this` is necessary in order to refer to the field.
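In Java this looks as follows (the class `GridPoint` is hypothetical): `this` appears only in the constructor, where the parameters deliberately shadow the fields, and nowhere else.

```java
final class GridPoint
{
	private final int x;
	private final int y;

	GridPoint( int x, int y )
	{
		this.x = x;
		this.y = y;
	}

	int manhattanLength()
	{
		return Math.abs( x ) + Math.abs( y );
	}
}
```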

Use of `var`: Rarely

Only for non-trivial types, and only when the type is obvious.
Of course, you might ask, when is the type obvious? The answer is simple: the type is obvious only when the name of the type is present on the right side of the assignment.
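A sketch of how this rule plays out, assuming Java 10 or later, where `var` is available for local variables (the class and variable names are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

final class VarDemo
{
	static int countPrimes()
	{
		var index = new TreeMap<String, List<Integer>>(); // non-trivial type, named on the right side: var is acceptable
		List<Integer> primes = index.computeIfAbsent( "primes", key -> new ArrayList<>() ); // type not named on the right side: spelled out
		primes.add( 2 );
		primes.add( 3 );
		int count = index.get( "primes" ).size(); // trivial type: spelled out
		return count;
	}
}
```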

Naming of files and classes: One class per file, exact same name

In Java this is standard, but there is one exception:

Java makes it impossible to access constructor parameters from field initializers. The solution to this is to pass the constructor parameter to the superclass, so that it can be stored in a protected member, so that it can be accessed by the field initializers of descendants. Quite often, we invent superclasses for no reason other than to be able to do just that. In these cases, it is okay (preferable even) if the superclass is package-private, and declared in the same file as the descendant.
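A minimal sketch of this trick (the class names `NamedBase` and `Named` are hypothetical). It works because Java runs the field initializers of a class only after the superclass constructor has returned, so by that time the protected member is already set.

```java
// Hypothetical package-private superclass which exists only to hold the constructor parameter.
// It would live in the same file as its descendant.
abstract class NamedBase
{
	protected final String name;

	NamedBase( String name )
	{
		this.name = name;
	}
}

// The field initializers below cannot see the constructor parameter, but they can see the inherited field.
final class Named extends NamedBase
{
	private final String greeting = "Hello, " + name;
	private final int nameLength = name.length();

	Named( String name )
	{
		super( name );
	}

	String greeting()
	{
		return greeting;
	}

	int nameLength()
	{
		return nameLength;
	}
}
```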

In C# one class per file with exact same name is not standard, so it is worth stating. Again, there are a few exceptions:

It is okay to declare all the classes that make up a small class hierarchy in a single file, as long as the file is named after the base class of the hierarchy.
It is also okay to declare trivial types like enums and delegates in the same file as the class that they conceptually belong to.

Namespace imports (C# only): Inside namespace declarations

Most people import their namespaces outside of their namespace declarations. This style guide mandates the opposite: namespaces must be imported inside namespace declarations. In other words, first we open our namespace, then we declare our imports, then we declare our class.
This is in accordance with the Principle of Smallest Scope, i.e. any given thing must have the smallest scope that it can possibly have.

Namespace Aliases (C# only)

For namespaces defined in the solution: Never

If you have defined a namespace in your solution, then you should never need to alias it. If it conflicts with a namespace defined outside your solution, then you should alias the external namespace.

For namespaces defined outside of the solution: Almost always

I have the habit of aliasing all external namespaces so as to make it evident exactly where each type is coming from. So for example, I never do `using System.Text` and reference `Encoding.UTF8`; I always do `using SysText = System.Text` and then I reference `SysText.Encoding.UTF8`. I make an exception for namespaces `System` and `System.Collections.Generic`.

Non-ANSI characters: Via Unicode Escape Sequences

That's because every once in a while some tool will garble non-ANSI characters by accident, and a) that's the kind of error that you will usually have no tests for, while b) even if there is a test, the non-ANSI character in the test might also be garbled, causing the test to pass when it should fail.
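In Java, for example, this means writing such characters as `\uXXXX` escapes, which the compiler translates before it does anything else, so the source file itself stays pure ASCII. The class and constant names below are hypothetical.

```java
final class EscapeDemo
{
	// "cafe" with an acute accent on the final letter (U+00E9).
	static final String Cafe = "caf\u00e9";

	// The euro sign (U+20AC).
	static final String Euro = "\u20ac";
}
```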

Miscellaneous

If something can be private, it must be private.
If something can be final/readonly, it must be final/readonly.
If something can be final/sealed, it must be final/sealed.
If something can be of a less-derived type, it must be of a less-derived type.

Unless you want to document something important; for example, you may want to use a `List` instead of a `Collection` to indicate that order matters.

If a string literal can be replaced with `nameof`, it must be replaced with `nameof`.
If a pair of parentheses can be omitted, it must be omitted.

Unless operator precedence is unclear and requires clarification.
Note that this means that the expression after the `return` keyword must never be parenthesized.

Overriding methods must not have documentation comments. The documentation comment of an override is the documentation comment of the method it overrides.
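As an illustration of the less-derived-type rule and its exception, here is a small Java sketch (the class and method names are hypothetical):

```java
import java.util.Collection;
import java.util.List;

final class LessDerivedDemo
{
	// Needs only iteration and size, so the parameter is a Collection rather than a List.
	static int average( Collection<Integer> values )
	{
		if( values.isEmpty() )
			return 0;
		int sum = 0;
		for( int value : values )
			sum += value;
		return sum / values.size();
	}

	// Order matters here, so the parameter is a List, which documents that fact.
	static int firstMinusLast( List<Integer> values )
	{
		return values.get( 0 ) - values.get( values.size() - 1 );
	}
}
```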

2018-04-04

GitHub project: mikenakis-agentclaire

NOTE:

This project has been retired. The GitHub link does not even work anymore.

This page only serves historical documentation purposes.

A Java Agent to end all Java Agents.

The mikenakis-agentclaire logo
based on a piece of clip art found on the interwebz.

Open Source but No License

I have posted some small projects of mine on GitHub, mainly so that prospective employers can appreciate my skills. I am not quite ready to truly open source them, so I published them under "No License". This means that I remain the exclusive copyright holder of these creative works, and nobody else can use, copy, distribute, or modify them in any way, shape or form. More information here: choosealicense.com - "No License" (https://choosealicense.com/no-permission/).

Pretty much the only thing one can legally do with these creative works is view their source code and admire it.

GitHub says that one can also make a copy of my projects (called a fork in GitHub parlance), but I am not sure what one would gain from doing that, because one cannot legally do anything with the forked code other than view it and admire it. Even more information here: Open Source SE - GitHub's “forking right” and “All rights reserved” projects (https://opensource.stackexchange.com/q/1154/10201)

(Okay, if you compile any of my projects and run it once or twice in order to check it out, I promise I will turn a blind eye.)

If you want to do anything more with any of these projects, please contact me.

2018-04-02

On JUnit's random order of test method execution

This is a rant about JUnit, or more precisely, a rant about JUnit's inability to execute test methods in natural method order.

Definition: Natural method order is the order in which methods appear in the source file.

What is the problem?

Douglas Crockford talking nonsense

Here is Douglas Crockford,
talking patent nonsense about Java and about exceptions,
neither of which he understands, obviously.

Start playing at 27':42''. The insanity lasts until 32':00''.
Enjoy responsibly.

2018-02-05

On Code Craftsmanship

I will try to make a list of items here, but I could probably write a book on this.
