Foreword
In my career I have experimented a lot with coding styles, mostly on pet projects at home, but also in workplaces where each developer
was free to code in whatever way they pleased, or in workplaces where I was
the only developer.
My experimentation has been in the direction of achieving maximum objective
clarity and readability, disregarding convention, custom, precedent, and the
shock factor: the fact that a particular style element might be alien to others plays very little role in my evaluation of the objective merits of
the element.
The counter-argument (the argument in favor of following convention) says that whatever benefits might be offered by a coding style cannot possibly outweigh the benefit of presenting others with a familiar coding style. This is of course true, and that's why it makes sense for an organization to choose a traditional coding style. However, I am not a company; I am an individual, and my own projects are mine. Furthermore, my counter-counter-argument is that I firmly believe that tradition is just another word for an obstacle to progress.
So, over the years I have tried many things, once even radically changing my
coding style in the middle of a project. (Modern IDEs make it very easy to do
so.) Some of the things I tried I later abandoned, others I permanently
adopted.
So, my coding style today is the result of all this experimentation. If it
looks strange to you, keep in mind that every single aspect of it has been
deliberately chosen to be this way by someone who was not always coding like
that, and who one day decided to start coding like that in the firm belief
that this way is objectively better.
In moving on with each of these changes over the years, I had to overcome my
own subjective distaste for the unfamiliar, for the benefit of what I
considered to be objectively better. So, if you decide to judge my coding style, please first ask yourself to what
extent you are willing to overcome the same.
My Very Own™ Coding Style
I use this coding style for languages that belong to the C syntax family, for example C, C++, Java, and C#. These are languages with curly braces, a reduced set of keywords, and a moderate amount of parentheses. I hardly ever program in any other language, but when I do, I apply whatever parts of this coding style are applicable.
- Tabs vs. Spaces: Tabs
- I use tabs for indentation, because this allows different developers to view the code with the amount of indentation that they are accustomed to, without having to reformat the code.
- Spaces should never be used for indentation.
- Tabs should never be used for anything other than indentation.
- Tabular formatting: No
- Tabular formatting refers to inserting spaces within statements in consecutive lines of code to align parts of the statements into columns across those lines of code. So, for example, in statements that are of the form `variable-type variable-name = initializer-expression;` spaces would be inserted after the variable-types to align all the variable-names in a column, and more spaces would be inserted after the variable-names to align all the equals-signs in a column.
- I used to be a big fan of this; however:
- Generics make this less appealing, because most type definitions might be short, but one generic type definition might be very long, resulting in lots of seemingly unnecessary whitespace.
- A change in one line of code may result in re-alignment of many lines around it, and diff tools are not smart enough to account for this, so the possibility of merge conflicts skyrockets.
- Thus, at some point my verdict became to drop tabular formatting.
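To make the trade-off concrete, here is a small Java sketch. The class and variable names are invented for this illustration. The comment shows what tabular formatting would do to three declarations once one of them has a long generic type; the method body shows the same declarations without it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example class; none of these names come from the text above.
class TabularFormattingExample
{
	// With tabular formatting (not used), one long generic type pads every other line:
	//
	//   int                        count  = 0;
	//   String                     name   = "x";
	//   Map<String, List<Integer>> scores = new HashMap<>();
	//
	// Without it, each line can change without forcing its neighbors to be re-aligned.
	static int describe()
	{
		int count = 0;
		String name = "x";
		Map<String, List<Integer>> scores = new HashMap<>();
		scores.put( name, new ArrayList<>() );
		return count + scores.size();
	}
}
```

Renaming `scores` or changing its type in the aligned version would touch all three lines in a diff; in the unaligned version it touches one.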
- Spaces:
- Before or after unary operators: Never
- Around binary operators: Always
- Around ternary operators: Always
- Before a comma: Never
- After a comma: Always
- Before opening parenthesis of function argument list: Never
- Before opening parenthesis of flow-control keyword: Never
- Inside parenthesized expressions: Never
- Around parameter lists: Always
- This means that a function call must look like this: `foo( a, b );` Note that there is a space after `(` and a space before `)`.
- This applies not only to function calls, but also to function declarations and to keywords that accept parameters.
- Parameterless functions can still be coded like this: `foo();` because the rule is carefully worded to call for spaces around parameter lists, not spaces inside parentheses. When you are invoking a parameterless function there is no parameter list, therefore no spaces.
- Note that although parameter lists require spaces, parenthesized expressions require no spaces, and therein lies the advantage of this pair of rules: it suddenly becomes clear which parenthesis belongs to a function call, and which parenthesis belongs to an expression. For example, passing an expression as a parameter to a function looks like this: `foo( a, (b + c) );`
- Note that certain C# constructs like `typeof()` and `nameof()` are expressions, not functions, therefore their arguments must not be padded with spaces.
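The spacing rules above might look as follows in Java. This is only a sketch with invented names (`SpacingExample`, `sum`, `zero`, `demo`); the call `sum( a, (b + c) )` mirrors the `foo( a, (b + c) )` example from the text.

```java
// Hypothetical example class; the names are invented for illustration.
class SpacingExample
{
	static int sum( int a, int b ) // declaration: spaces around the parameter list
	{
		return a + b; // binary operator: spaces on both sides
	}

	static int zero() // no parameter list, hence no spaces
	{
		return 0;
	}

	static int demo( int a, int b, int c )
	{
		int total = sum( a, (b + c) ); // padded parameter list, unpadded parenthesized expression
		if( total < 0 ) // keyword that accepts parameters: no space before `(`, padding inside
			total = -total; // unary operator: no space
		return total > 100 ? zero() : total; // ternary operator: spaces around `?` and `:`
	}
}
```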
- Right Margin Column: 120
- In May of 2020 Linus Torvalds declared that the number of characters per line in the Code Style of the Linux Kernel was to be increased from 80 characters to 100 characters. That's laughable. We have had widescreen monitors since the beginning of the century. We can easily do 120 characters. I sometimes do 160 characters.
- Hard right margin: No
- The right margin is not meant to be a hard limit: if a line needs to be longer, make it longer. It is fine to push uninteresting stuff off the screen horizontally in order to fit more interesting stuff inside the screen vertically.
- New line after attributes (C#) / annotations (Java): Never
- This may push the function definition quite far to the right, and that's fine.
- If a function has lots and lots of attributes/annotations, it might look very ugly, but that's okay, because it happens very rarely, and when it does happen, maybe that is exactly how it should be: beautiful things should look beautiful, and ugly things should look ugly.
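As a brief illustration in Java, an annotation stays on the same line as the declaration it applies to. The class `GridPoint` is made up for this sketch.

```java
// Hypothetical example class; the name and fields are invented for illustration.
final class GridPoint
{
	private final int x;
	private final int y;

	GridPoint( int x, int y )
	{
		this.x = x;
		this.y = y;
	}

	@Override public String toString() // annotation and declaration share one line
	{
		return "(" + x + ", " + y + ")";
	}
}
```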
- Empty lines before a block-style comment: One
- If a block-style comment appears in code, there must always be a blank line before it.
- This way, we are clearly indicating that the comment refers to the following line.
- Empty lines within functions: Zero
- Quite often programmers like to use blank lines to visually separate pieces of code that are conceptually different. The problem is, the blank line gives no hint about the concepts involved, so it is entirely useless to anyone but the person who inserted it. If it is worth leaving a blank line, then it is worth adding a block-style comment explaining why, in which case a blank line before the comment is also necessary due to a previous rule. Better yet, move the conceptually different code into a different function, and give that function a descriptive name, so that you need neither comment nor blank line.
- Empty lines before a function: One
- The rule which requires a blank line before a block comment covers all the cases where a function is preceded by a block comment that describes the function. However, quite often functions have descriptive names, rendering explanatory comments unnecessary. For these cases, we mandate separating functions with a blank line.
- Empty lines between fields: Zero
- If you really need a blank line between two fields, you must insert a block comment.
- Empty lines anywhere else: Zero
- Some people have the habit of leaving one or more blank lines in various odd places according to some ad-hoc rules that exist only in their head. The problem is, it is impossible to teach such rules to an automatic reformatting tool. Therefore, there shall be no such rules. There should never be any spurious blank lines anywhere.
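Taken together, the blank-line rules above could produce a class laid out like the following Java sketch. The class `StepCounter` and its members are invented for this example.

```java
// Hypothetical example class; the names are invented for illustration.
final class StepCounter
{
	private int value;
	private int step = 1;

	/* Advances the counter by the current step. One blank line precedes this comment. */
	void advance()
	{
		value += step;
	}

	void setStep( int step )
	{
		this.step = step;
	}

	int getValue()
	{
		return value;
	}
}
```

There are no blank lines between the two fields or inside any function body, and exactly one blank line before each function or before the comment that introduces it.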
- Curly braces: Allman
- See http://en.wikipedia.org/wiki/Indent_style#Allman_style
- Each opening brace and each closing brace is on a separate line, the braces are at the same indentation level as the controlling statement, and the code in the block is one indentation level deeper.
- Luckily, this is the curly brace style of C#.
- Unluckily, this is not the curly brace style of Java.
- I do not care; this is my coding style even when I code in Java. The Egyptian curly brace style which is so popular in the Java world is absolutely retarded.
- Braces on single statement blocks: Never (unless the language requires them)
- The "always" choice seems to be very popular; that's retarded.
- The "sometimes" choice also seems to be popular, but I strive for consistency.
- Note that some languages have constructs that require curly braces even if the controlled block consists of a single statement, for example try-catch in C++ and try-catch-finally in Java and C#. I greatly resent this.
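A minimal Java sketch of the two brace rules, with invented names: Allman braces wherever a block is needed, no braces around single statements, and braces on `try` and `catch` only because the language insists.

```java
// Hypothetical example class; the names are invented for illustration.
class BraceExample
{
	static int clamp( int value, int low, int high )
	{
		if( value < low )
			return low;
		if( value > high )
			return high;
		return value;
	}

	static int parseOrDefault( String text, int fallback )
	{
		try
		{
			return Integer.parseInt( text );
		}
		catch( NumberFormatException e )
		{
			return fallback;
		}
	}
}
```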
- Nesting: Always consistent*
- Some people like writing quick one liners, for example `if( x ) return 0;` all in one line. That's unacceptable.
- Some people refrain from nesting the `case` labels in a `switch` statement, or if they do, then they refrain from nesting the code under the `case` labels. That's unacceptable.
- In C#, people quite often refrain from nesting the classes within their namespaces. That's unacceptable.
- In C#, people quite often do not nest cascaded `using` statements. That's unacceptable.
- The only case where I sometimes violate this rule, and I am not yet completely decided on how to go about it, is with single statement functions in Java, which I sometimes code in one line, not because I believe this is correct, but because I am expressing a wish that Java would offer a functional style of declaring functions the way C# does.
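For the `switch` case in particular, consistent nesting could look like this in Java. The class and method names are invented for the sketch: the `case` labels are one level deeper than the `switch`, and the code under each label is one level deeper again.

```java
// Hypothetical example class; the names are invented for illustration.
class NestingExample
{
	static String describe( int sides )
	{
		switch( sides )
		{
			case 3:
				return "triangle";
			case 4:
				return "quadrilateral";
			default:
				return "polygon";
		}
	}
}
```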
- Type identifier casing: SentenceCase (more commonly known as PascalCase)
- Even if the type is private.
- Constant identifier casing: SentenceCase
- Even if the constant is private.
- Private member identifier casing: camelCase
- A very popular choice is prefixing the identifier with `_` or `m_`; that's unacceptable.
- Public member identifier casing:
- C#: SentenceCase
- Java: camelCase
- The camelCase choice of Java is retarded, but it would be too heretical even for me to go against it, mainly because there exist tools that use reflection to guess what methods are getters and setters, and everything goes haywire if the capitalization is not what these tools expect.
- Static fields: Same as other fields
- Note: this explicitly means that static fields must not be named differently from other fields. Some people like doing weird things like prefixing static fields with `s_`. That's not only mighty ugly, but also entirely unnecessary, because any half-decent IDE will color-code static fields for you.
- Acronyms: SentenceCase
- In other words, never use "GUID"; always use "Guid". The acronym becomes a word, so that it can be added to the spell-checker.
- Speaking of spell-checkers:
- The spell-checker must always be on.
- Every commit must pass inspection by the spell-checker.
- The spell-checker wordlist must be committed like any other file.
- The spell-checker wordlist must pass code review like anything else.
- That's how the quality of the codebase can be protected despite contributions from people with poor command of the English language.
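The casing rules for types, constants, private members, public members, static fields, and acronyms can be seen side by side in a short Java sketch. Everything here, including the class name `HttpProbe`, is invented for illustration.

```java
// Hypothetical example class; all names are invented for illustration.
final class HttpProbe // type: capitalized words, with the acronym written as a word (Http, not HTTP)
{
	private static final int DefaultTimeoutSeconds = 30; // constant: same casing as types, even if private
	private static int instanceCount = 0; // static field: named like any other field, no s_ prefix
	private final String url; // private member: camelCase, no _ or m_ prefix

	HttpProbe( String url )
	{
		this.url = url;
		instanceCount++;
	}

	public String getUrl() // public member in Java: camelCase, so that reflection-based tools find it
	{
		return url;
	}

	public int getTimeoutSeconds()
	{
		return DefaultTimeoutSeconds;
	}

	public static int getInstanceCount()
	{
		return instanceCount;
	}
}
```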
- Explicit `this`: Never
- Unless a field is receiving its value from a method or constructor parameter, in which case the parameter must have the exact same name as the field, and consequently `this` is necessary in order to refer to the field.
- Use of `var`: Rarely
- Only for non-trivial types, and only when the type is obvious.
- Of course, you might ask, when is the type obvious? The answer is simple: the type is obvious only when the name of the type is present on the right side of the assignment.
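The rules on `this` and `var` might play out as in the following Java sketch. It assumes Java 10 or later, where `var` is available for local variables, and the class `Inventory` and its members are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical example class; the names are invented for illustration.
final class Inventory
{
	private final String owner;
	private final List<String> items = new ArrayList<>();

	Inventory( String owner )
	{
		this.owner = owner; // the one permitted use of `this`: parameter and field share a name
	}

	String getOwner()
	{
		return owner; // no `this` anywhere else
	}

	void add( String item )
	{
		items.add( item );
	}

	Map<Character, List<String>> groupByInitial()
	{
		var groups = new TreeMap<Character, List<String>>(); // non-trivial type, named on the right side
		for( String item : items ) // explicit type: nothing on the right side names it
			groups.computeIfAbsent( item.charAt( 0 ), key -> new ArrayList<>() ).add( item );
		return groups;
	}
}
```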
- Naming of files and classes: One class per file, exact same name
- In Java this is standard, but there is one exception:
- Java makes it impossible to access constructor parameters from field initializers. The solution to this is to pass the constructor parameter to the superclass, so that it can be stored in a protected member, so that it can be accessed by the field initializers of descendants. Quite often, we invent superclasses for no reason other than to be able to do just that. In these cases, it is okay (preferable even) if the superclass is package-private, and declared in the same file as the descendant.
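A minimal sketch of the superclass workaround just described, using invented names (`GreeterBase`, `Greeter`). Both classes would live in the same file; in a real project the descendant would typically be the public class that the file is named after.

```java
// Package-private superclass, invented only to capture the constructor parameter.
abstract class GreeterBase
{
	protected final String name;

	GreeterBase( String name )
	{
		this.name = name;
	}
}

final class Greeter extends GreeterBase
{
	private final String greeting = "Hello, " + name + "!"; // initializer reads the inherited field

	Greeter( String name )
	{
		super( name );
	}

	String getGreeting()
	{
		return greeting;
	}
}
```

This works because Java runs the superclass constructor before the field initializers of the subclass, so `name` is already set by the time `greeting` is computed.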
- In C# one class per file with exact same name is not standard, so it is worth stating. Again, there are a few exceptions:
- It is okay to declare all the classes that make up a small class hierarchy in a single file, as long as the file is named after the base class of the hierarchy.
- It is also okay to declare trivial types like enums and delegates in the same file as the class that they conceptually belong to.
- Namespace imports (C# only): Inside namespace declarations
- Most people import their namespaces outside of their namespace declarations. This style guide mandates the opposite: namespaces must be imported inside namespace declarations. In other words, first we open our namespace, then we declare our imports, then we declare our class.
- This is in accordance with the Principle of Smallest Scope, i.e. any given thing must have the smallest scope that it can possibly have.
- Namespace Aliases (C# only)
- For namespaces defined in the solution: Never
- If you have defined a namespace in your solution, then you should never need to alias it. If it conflicts with a namespace defined outside your solution, then you should alias the external namespace.
- For namespaces defined outside of the solution: Almost always
- I have the habit of aliasing all external namespaces so as to make it evident exactly where each type is coming from. So for example, I never do `using System.Text` and reference `Encoding.UTF8`; I always do `using SysText = System.Text` and then I reference `SysText.Encoding.UTF8`. I make an exception for namespaces `System` and `System.Collections.Generic`.
- Non-ASCII characters: Via Unicode Escape Sequences
- That's because every once in a while some tool will accidentally garble non-ASCII characters, and a) that's the kind of error that you will usually have no tests for, while b) even if there is a test, the non-ASCII character in the test might also be garbled, causing the test to pass when it should fail.
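In Java, for instance, characters outside the ASCII range can be written with `\u` escapes so that the source file itself stays pure ASCII. The sketch below uses invented names, and `isAsciiOnly` is a hypothetical helper added only to show the difference.

```java
// Hypothetical example class; the names are invented for illustration.
final class EscapeExample
{
	static final String Cafe = "caf\u00e9"; // U+00E9, LATIN SMALL LETTER E WITH ACUTE
	static final String Price = "\u20ac 5"; // U+20AC, EURO SIGN

	static boolean isAsciiOnly( String text )
	{
		return text.chars().allMatch( c -> c < 128 );
	}
}
```

The strings still contain the real characters at run time; only their spelling in the source file changes.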
- Miscellaneous
- If something can be private, it must be private.
- If something can be final/readonly, it must be final/readonly.
- If something can be final/sealed, it must be final/sealed.
- If something can be of a less-derived type, it must be of a less-derived type.
- Unless you want to document something important; for example, you may want to use a `List` instead of a `Collection` to indicate that order matters.
- If a string literal can be replaced with `nameof`, it must be replaced with `nameof`.
- If a pair of parentheses can be omitted, it must be omitted.
- Unless operator precedence is unclear and requires clarification.
- Note that this means that the expression after the `return` keyword must never be parenthesized.
- Overriding methods must not have documentation comments. The documentation comment of an override is the documentation comment of the method it overrides.
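A few of the miscellaneous rules, sketched in Java with invented names: the class is `final`, `sum` accepts the least-derived type that works, `runningTotals` deliberately uses `List` to signal that order matters, and no `return` expression is parenthesized.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical example class; the names are invented for illustration.
final class Stats
{
	// Summing does not depend on order, so the parameter is only a Collection.
	static int sum( Collection<Integer> values )
	{
		int total = 0;
		for( int value : values )
			total += value;
		return total; // not: return (total);
	}

	// Running totals depend on order, so both parameter and result are Lists.
	static List<Integer> runningTotals( List<Integer> values )
	{
		List<Integer> totals = new ArrayList<>();
		int total = 0;
		for( int value : values )
		{
			total += value;
			totals.add( total );
		}
		return totals;
	}

	static boolean isOutside( int value, int low, int high )
	{
		return value < low || value > high; // precedence is clear, so no parentheses
	}
}
```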