2021-10-04

What is wrong with Java


This is part of a series of posts in which I am documenting what is wrong with certain popular programming languages that I am (more or less) familiar with.  The aim of these posts is to support a future post in which I will be describing what the ideal programming language would look like for me.  

I will be amending and revising these texts over time.

(Useful pre-reading: About these papers)

What is wrong with Java:

  • The garbage collector.
  • Curly braces.
  • Primitive types are cumbersome.
    • Each one of the primitive types `boolean`, `byte`, `char`, `short`, `int`, `float`, `long`, `double` is a snowflake which must always be handled differently from the others.  They cannot all be treated uniformly as value types.
    • To allow for at least some uniform treatment, one must keep converting back and forth between them and their corresponding wrapper classes (`Boolean`, `Byte`, `Character`, `Short`, `Integer`, `Float`, `Long` and `Double`) which is clunky and inelegant.
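
To illustrate the back-and-forth between primitives and wrappers, here is a minimal sketch. The class and method names are made up for this example; only `Arrays.asList` and the autoboxing behavior come from the platform.

```java
import java.util.Arrays;
import java.util.List;

class PrimitiveBoxingDemo {
    // Generic code only accepts reference types, so an int[] has to be boxed by hand.
    static Integer[] box(int[] values) {
        Integer[] boxed = new Integer[values.length];
        for (int i = 0; i < values.length; i++) {
            boxed[i] = values[i]; // autoboxing: int -> Integer
        }
        return boxed;
    }

    // Arrays.asList infers T = int[] here, so the list holds ONE element: the array itself.
    static int sizeOfListFromPrimitiveArray() {
        return Arrays.asList(new int[] {1, 2, 3}).size();
    }

    // With the boxed array, T = Integer and the list has three elements.
    static int sizeOfListFromBoxedArray() {
        List<Integer> list = Arrays.asList(box(new int[] {1, 2, 3}));
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeOfListFromPrimitiveArray()); // prints 1
        System.out.println(sizeOfListFromBoxedArray());     // prints 3
    }
}
```

The same hand-written conversion would have to be repeated for each of the other seven primitive types.
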
  • Still no user-defined value types in 2022.
    • Java has had records since version 16 (previewed in version 14), but they are still allocated on the heap and passed by reference. So, an array of 1000 structs, which is a single memory block in C#, becomes an array of 1000 records spread over 1001 memory blocks in Java.
    • Allegedly, a future version of Java will support value types, but knowing how bytecode is structured and how the JVM works:
      • This is going to be extremely difficult to achieve.
      • It will probably be a cumbersome addition to the language.
      • The existing awkward primitive types will of course stay with us forever in the name of backwards compatibility.
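
A small sketch of what "allocated on the heap and passed by reference" means for records, assuming Java 16 or later. `Point` and the helper methods are hypothetical names chosen for this example.

```java
class RecordIdentityDemo {
    // A record is concise, but it is still an ordinary heap-allocated reference type.
    record Point(int x, int y) {}

    // Records get a generated equals(), so two points with the same fields are equal by value...
    static boolean equalByValue() {
        return new Point(1, 2).equals(new Point(1, 2));
    }

    // ...yet each `new` yields a separate object with its own identity.
    static boolean sameObject() {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2);
        return a == b;
    }

    // A freshly allocated Point[] holds 1000 null references, not 1000 inline points.
    static boolean freshArrayHoldsNulls() {
        Point[] points = new Point[1000];
        return points[0] == null;
    }

    public static void main(String[] args) {
        System.out.println(equalByValue());         // prints true
        System.out.println(sameObject());           // prints false
        System.out.println(freshArrayHoldsNulls()); // prints true
    }
}
```
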
  • Still no value tuples in 2022.  C# has been doing a pretty good job at that.
  • No conditional compilation.
    • Cannot even declare a constant whose value is externally supplied.
  • Generics are decent, but still lacking.
    • Type erasure allows unsafe constructs which may result in "heap pollution".
    • Type erasure makes it impossible to disambiguate entities based on their generic parameters, thus making it impossible to overload based on generics.  This forces us to give artificially different names to entities that would ideally share the same name.
    • Working with generics inevitably requires either littering the code with `@SuppressWarnings( "unchecked" )`, or entirely disabling the "unchecked" warning, which opens up another can of worms.
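
The erasure complaints can be sketched as follows. The class and method names are invented for illustration, and the commented-out overloads show the kind of declaration the compiler rejects.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Both lists share one runtime class; the type argument is gone after compilation.
    static boolean sameRuntimeClass() {
        return new ArrayList<String>().getClass() == new ArrayList<Integer>().getClass();
    }

    // Heap pollution: a raw reference lets an Integer into a List<String>.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean pollutedListFailsOnRead() {
        List<String> strings = new ArrayList<>();
        List raw = strings; // raw type: generic checks are switched off
        raw.add(42);        // accepted with nothing more than an "unchecked" warning
        try {
            String first = strings.get(0); // the hidden cast to String fails here
            return false;
        } catch (ClassCastException expected) {
            return true;
        }
    }

    // These two overloads cannot coexist, because both erase to process(List):
    // static void process(List<String> items) {}
    // static void process(List<Integer> items) {}

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());        // prints true
        System.out.println(pollutedListFailsOnRead()); // prints true
    }
}
```

Note that the failure surfaces where the element is read, not where the wrong element was added.
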
  • No C#-style properties.
  • No operator overloading.
    • In general, the language design philosophy of Java seems to be overly protectionist towards the idiot programmer, at the expense of the expert programmer who just can't have a feature that they want because it would be potentially dangerous for the idiot. This is roughly the same narrow-minded protectionist design philosophy that has been employed by Apple and has given rise to what is known as "the mac user", which is a code-word for "idiot".
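
As a small illustration of the operator overloading point, this is roughly what arithmetic on a non-primitive numeric type looks like. The `total` method and its parameter names are made up for the example.

```java
import java.math.BigDecimal;

class NoOperatorOverloadingDemo {
    // With operator overloading this would read: (price + shipping) * quantity
    static BigDecimal total(BigDecimal price, BigDecimal shipping, BigDecimal quantity) {
        return price.add(shipping).multiply(quantity);
    }

    public static void main(String[] args) {
        BigDecimal result = total(new BigDecimal("1.10"), new BigDecimal("2.20"), new BigDecimal("3"));
        System.out.println(result); // prints 9.90
    }
}
```
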
  • No namespaces.
    • Packages are ill-conceived and lame:
      • Each source file must be associated with one and only one package. So, if the source file is to contain multiple classes, all these classes must belong to the same package.
      • Each source file may contain no more than one public top-level class. Any additional top-level classes must be package-private.
      • There is no equivalent to the namespace aliases of C#.
      • Packages are unrelated to packaging. (See lack of assemblies.)
      • Packages (and the lack of assemblies) force programmers to cram an impossibly large number of classes within the same package so as to be able to keep some of them package-private, because the moment you try moving a class into a separate package to reduce the clutter, this class must now become public, so as to remain accessible by classes from the original package.
      • Despite the fact that package names look hierarchical, packages are not at all hierarchical: 
        • Each package is completely separate from all other packages.
        • There exists no special relationship between two packages by virtue of one name being nested within the other.  (In C#, code inside a namespace can refer by simple name to everything declared in its enclosing namespaces.)
        • It is impossible to address a class in a sub-package with a partial (relative) sub-package name.
        • This, in combination with the fact that there is no equivalent to namespace aliases, means that two classes with identical names in different packages can only be handled using fully qualified class names.
        • Since fully qualified class names are cumbersome to work with, most people resort to assigning globally unique names to their classes.
          • This is very clunky, and it looks retarded, because it essentially results in class names that contain the name of their package.
          • This is an uphill struggle and never quite successful, because you might give unique names to all your classes, but you might use some library with class names that conflict with yours, so there will always be some fully qualified class names around.
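
The standard library itself offers an example of the name clash problem: it contains both `java.net.Proxy` and `java.lang.reflect.Proxy`. The sketch below (the demo class and its methods are hypothetical) shows that only one of them can be imported by its simple name.

```java
import java.net.Proxy; // only one of the two Proxy classes can be imported by simple name

class NameClashDemo {
    // java.net.Proxy is reachable by its simple name thanks to the import.
    static Proxy.Type directType() {
        return Proxy.NO_PROXY.type();
    }

    // java.lang.reflect.Proxy must be spelled out in full, since there is no import alias.
    static boolean isStringAProxyClass() {
        return java.lang.reflect.Proxy.isProxyClass(String.class);
    }

    public static void main(String[] args) {
        System.out.println(directType());          // prints DIRECT
        System.out.println(isStringAProxyClass()); // prints false
    }
}
```
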
  • No C#-style assemblies.
    • Individual class files scattered all over the place are cumbersome to work with.
    • The filesystem/jar-file duality is very cumbersome to work with.
    • Jar files only deal with packaging; they offer no support for specifying what is exported and what is kept private.
    • Modules were added as an afterthought, and they give some control over what to publish and what to keep private, but the unit of publication is still the package, not the class, which means that package-private classes are still necessary, which in turn means that huge packages are still necessary.
  • Class loaders are lame.
    • They, as well as many other language features, are a relic from the Java web applet era.
    • They are very cumbersome to work with.
    • They unnecessarily impose a significant performance penalty by doing a lot of work on a per-class basis instead of a per-module basis.
  • Lame access rules.
    • Everything that is protected is also visible to the entire package, as if it were package-private. (Duh?)
    • Inner classes have access to private members of the enclosing class; this is probably okay; however, the enclosing class also has access to private members of inner classes, which is retarded.
  • Member initializers have no access to constructor parameters.
  • Member initializers execute between the invocation of the super constructor and the statement that immediately follows it, which technically makes sense, but these jumps in the flow of execution are completely counter-intuitive to the novice programmer, who is precisely the type of programmer that the language caters to. Scala has shown how to do this right.
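
The jump in the flow of execution can be made visible with a small trace. This is only a sketch; `Base`, `Derived`, `LOG` and `trace` are names invented for it.

```java
import java.util.ArrayList;
import java.util.List;

class InitOrderDemo {
    static final List<String> LOG = new ArrayList<>();

    // Records a step and passes the value through, so it can be used in an initializer.
    static int logged(String step, int value) {
        LOG.add(step);
        return value;
    }

    static class Base {
        Base() {
            LOG.add("Base constructor");
        }
    }

    static class Derived extends Base {
        // Runs after super() returns but before the rest of the constructor body.
        final int field = logged("Derived field initializer", 1);

        Derived() {
            super();
            LOG.add("Derived constructor body");
        }
    }

    // Creates one Derived instance and returns the order in which things ran.
    static List<String> trace() {
        LOG.clear();
        new Derived();
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        // prints [Base constructor, Derived field initializer, Derived constructor body]
        System.out.println(trace());
    }
}
```
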
  • The syntax for invoking the super constructor suggests that one might be able to insert statements before the call to super, but this is not the case. (The deviation from the C++ syntax would be justifiable if the new syntax had something to offer, but it does not.) The language falls short of doing the one sensible thing that this syntax would allow, which would be to be able to put code before the call to super, as long as this code does not try to access `this`, for example assertions on the constructor parameters before passing them to super; but no, you cannot do that.
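
A common workaround, sketched here with made-up class names, is to move the check into a static helper that is called inside the argument list of `super(...)`. It assumes a Java version in which statements before the call to super are rejected, which is the situation described above.

```java
class SuperArgumentCheckDemo {
    static class Buffer {
        final int capacity;

        Buffer(int capacity) {
            this.capacity = capacity;
        }
    }

    static class CheckedBuffer extends Buffer {
        CheckedBuffer(int capacity) {
            // The check cannot be a statement placed before super(...), so it is
            // folded into the argument expression by way of a static helper.
            super(requirePositive(capacity));
        }

        private static int requirePositive(int value) {
            if (value <= 0) {
                throw new IllegalArgumentException("capacity must be positive: " + value);
            }
            return value;
        }
    }

    // Returns true when construction is refused for the given capacity.
    static boolean rejects(int capacity) {
        try {
            new CheckedBuffer(capacity);
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejects(0));  // prints true
        System.out.println(rejects(16)); // prints false
    }
}
```

It works, but each check needs its own helper method, and the helper must return the value it checked.
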
  • No named / optional parameters to functions. (No default parameter values.)
  • Default interface methods cannot be final.
    • Any class implementing an interface may inadvertently re-implement functionality which has already been provided by a default method.
    • An interface cannot guarantee that a certain method will have a specific behavior because any class implementing that interface may override that behavior.
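
A minimal sketch of the problem, with hypothetical `Greeter`, `Polite` and `Rogue` types: the interface intends `greet()` to be fixed, but nothing stops an implementation from replacing it.

```java
class DefaultMethodOverrideDemo {
    interface Greeter {
        String name();

        // Intended as a fixed contract, but a default method cannot be declared final.
        default String greet() {
            return "Hello, " + name();
        }
    }

    static class Polite implements Greeter {
        public String name() {
            return "Ada";
        }
    }

    static class Rogue implements Greeter {
        public String name() {
            return "Bob";
        }

        @Override
        public String greet() { // silently replaces the behavior the interface meant to guarantee
            return "Go away";
        }
    }

    public static void main(String[] args) {
        System.out.println(new Polite().greet()); // prints Hello, Ada
        System.out.println(new Rogue().greet());  // prints Go away
    }
}
```
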
  • Interface methods cannot be protected.
    • It is sometimes useful to have a certain interface method that is only visible by implementing classes, but no, we cannot have that, all methods must be public.
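  • Interface methods could not be private until Java 9.
    • It is sometimes useful to have a certain interface method that is only visible to default interface methods within that same interface. Java 8 offered no way to do that, since all methods had to be public; private interface methods only arrived in Java 9, one release after default methods created the need for them.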
  • Lambda parameter names are not allowed to shadow the names of local variables of the enclosing scope. This is very lame because:
    • It forces the programmer to invent new, unnatural names for lambda arguments.
    • Variables of the enclosing scope cannot be masked, so they remain accessible within the lambda, and can thus be accessed by mistake, leading to bugs that are very hard to detect.
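
Both consequences show up in this small sketch (the method and variable names are invented): the natural parameter name is rejected, and the outer variable is then used by mistake without any complaint from the compiler.

```java
import java.util.function.Function;

class LambdaShadowingDemo {
    static int lengthViaLambda(String input) {
        String name = "outer";
        // Function<String, Integer> f = name -> name.length();
        //   does not compile: 'name' is already defined in the enclosing scope.
        // So the parameter gets an artificial name, and the outer variable stays reachable:
        Function<String, Integer> f = n -> name.length(); // bug: n.length() was intended
        return f.apply(input);
    }

    public static void main(String[] args) {
        System.out.println(lengthViaLambda("abc")); // prints 5, the length of "outer"
    }
}
```
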
  • No member literals and not even a 'nameof' operator.
  • No nullable/non-nullable semantics for reference types. (C# 8 does a fairly decent job at that.)
  • No variable declarations inside assignment expressions. (`while( (var line = next()) != null )`)
  • No nested methods.
    • You can have a function-local class, but you cannot have a function-local function. The workaround is to declare and instantiate a function-local anonymous class containing the nested method, but this is cumbersome, unnecessarily verbose, and incurs a performance penalty.
  • No redefining of names (as with the `new` keyword of C#)
  • The long history of the language inevitably means that there are some bad choices of yore which interfere with newly introduced features. For example:
    • The ability to use the same name for a field and a method never really offered anything of value, but it did necessitate the introduction of the cumbersome double-colon operator when method references were added to the language.
  • Checked exceptions.
    • They were a good idea in principle, but turned out to be too cumbersome in practice. 
    • With the advent of lambdas, they represent nothing but hindrance.
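
Here is a sketch of the friction between checked exceptions and lambdas. `firstLine` and `firstLines` are hypothetical helpers; the point is the try/catch wrapper that has to be written inside the lambda because `Function.apply` declares no checked exceptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.stream.Collectors;

class CheckedExceptionLambdaDemo {
    // A method that declares a checked exception, like much of java.io.
    static String firstLine(String text) throws IOException {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line = reader.readLine();
            return line == null ? "" : line;
        }
    }

    static List<String> firstLines(List<String> texts) {
        return texts.stream()
                // .map(CheckedExceptionLambdaDemo::firstLine) would not compile,
                // so the call is wrapped by hand:
                .map(text -> {
                    try {
                        return firstLine(text);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e); // pure boilerplate
                    }
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstLines(List.of("a\nb", "c"))); // prints [a, c]
    }
}
```
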
  • Collecting a stack trace (and therefore also throwing an exception) might not be as excruciatingly slow as it is in C#, but it is still unnecessarily slow, and prohibitively slow for some purposes.
  • No feature like the __FILE__ and __LINE__ intrinsic macros of C++. 
    • There is no way to obtain this information without walking the stack, which is especially problematic since walking the stack is unreasonably slow.
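
For reference, this is roughly what obtaining the current line number looks like with the `StackWalker` API of Java 9 and later. The helper names are made up, and the sketch assumes the class was compiled with line number information, which is the default for `javac`.

```java
class CallSiteDemo {
    // Returns the line number of the caller, found by walking the stack at run time.
    static int callerLine() {
        return StackWalker.getInstance()
                .walk(frames -> frames.skip(1) // skip the frame of callerLine() itself
                        .findFirst()
                        .map(StackWalker.StackFrame::getLineNumber)
                        .orElse(-1));
    }

    // Stands in for what a __LINE__ intrinsic would provide at zero cost.
    static int whereAmI() {
        return callerLine();
    }

    public static void main(String[] args) {
        System.out.println("line " + whereAmI());
    }
}
```
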
  • The built-in collection model is very outdated and lame.
    • Arrays do not implement any of the collection interfaces so they always need special handling.
    • The `Iterator` interface is lame.
      • The `hasNext()` and `next()` methods are unusable in a for-loop.  (A for-each loop can be used with an `Iterable`, but then you have no access to the `Iterator`.)
      • A filtering iterator cannot be implemented without cumbersome look-ahead logic and then it is impossible to use it for removing items from the collection because looking ahead means that you are always past the item you want to delete.
    • Lack of unmodifiable collection interfaces means no compile-time readonlyness. 
      • Every single collection instance looks mutable, since it is implementing an interface that has mutation methods, but quite often is secretly immutable, meaning that if you make the mistake of invoking any of the mutation methods, you will be slapped with a runtime exception.
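
The "secretly immutable" problem in a few lines. The `rejectsAdd` helper is a made-up name; `List.of` requires Java 9 or later.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SecretlyImmutableDemo {
    // All three lists below have the same static type, so the compiler cannot
    // tell which of them will refuse to be modified.
    static boolean rejectsAdd(List<String> list) {
        try {
            list.add("x");
            return false;
        } catch (UnsupportedOperationException e) {
            return true; // discovered only at run time
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsAdd(new ArrayList<>()));                               // prints false
        System.out.println(rejectsAdd(List.of("a")));                                    // prints true
        System.out.println(rejectsAdd(Collections.unmodifiableList(new ArrayList<>()))); // prints true
    }
}
```
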
  • Fluent collections (collection streams) are lame.
    • They are unnecessarily verbose:
      • They require every single call chain to begin with a quite superfluous-looking `stream()` operation.
      • They almost always have to be ended with an equally superfluous-looking `collect()` operation.
    • They are not particularly extensible because they are entirely based on a single interface (`Stream`). Their only point of extensibility is at the very end of each call chain, by means of custom-written collectors.
    • Collectors are convoluted, so writing one is not trivial.
    • Collection streams work by means of incredibly complex logic behind the scenes, so:
      • They are very difficult to debug.
      • They are noticeably slower than C#-style fluent collection operations even before we consider the collection step at the end.
    • The collection step is tantamount to making an unnecessary safety copy of the information produced by the collection stream chain.
    • Collection streams are unnecessarily convoluted due to the ill-conceived notion that the mechanism used for fluent collection operations should also be usable for parallel collection operations.
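
A typical call chain, as a sketch with an invented method name, shows the `stream()` and `collect()` ceremony around two lines of actual logic.

```java
import java.util.List;
import java.util.stream.Collectors;

class StreamCeremonyDemo {
    static List<String> upperCaseOfShortWords(List<String> words) {
        return words.stream()                  // ceremony: enter the stream world
                .filter(w -> w.length() <= 3)  // actual logic
                .map(String::toUpperCase)      // actual logic
                .collect(Collectors.toList()); // ceremony: copy the result into a new list
    }

    public static void main(String[] args) {
        System.out.println(upperCaseOfShortWords(List.of("a", "tree", "cat"))); // prints [A, CAT]
    }
}
```
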
  • Various standard classes are implemented in lame ways. For example:
    • Many input-output stream classes (e.g. `BufferedInputStream`, `BufferedWriter`, `PrintStream`) suffer a performance handicap due to a needless and ill-conceived attempt at being thread-safe.
    • Input-output functionality is often achievable not via interfaces, but instead via abstract classes with an unnecessarily verbose set of methods, which makes extending them a tedious and error-prone endeavor.  (E.g. `java.io.Writer`, `java.io.OutputStreamWriter`.)
    • There is no way to attempt parsing a number and obtain an indication as to whether the parsing succeeded or not, without:
      • Suffering the performance penalty of an exception being thrown.
      • Having to write code that catches the exception to notice that parsing failed.
      (And funnily enough, even though the Java runtime makes liberal use of checked exceptions everywhere, the parse-failed exception, `NumberFormatException`, is unchecked.)
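
The usual workaround is a hand-written wrapper like the sketch below. `tryParseInt` is a hypothetical helper, not a standard library method, and it still pays for the exception whenever parsing fails.

```java
import java.util.OptionalInt;

class TryParseDemo {
    // Hides the exception behind an OptionalInt, but cannot avoid its cost.
    static OptionalInt tryParseInt(String text) {
        try {
            return OptionalInt.of(Integer.parseInt(text));
        } catch (NumberFormatException e) { // unchecked, thrown by Integer.parseInt
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParseInt("42").isPresent()); // prints true
        System.out.println(tryParseInt("4x").isPresent()); // prints false
    }
}
```
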
  • The for-each loop does not do anything about closeable iterators. (The for-each loop of C# properly disposes disposable enumerators.)
  • The try-with-resources statement requires a variable to be defined to hold the closeable object. (The equivalent 'using' statement of C# has no such requirement.)
  • The language runtime is full of always-on error checks instead of using assertions.
  • The inner workings of the language runtime are convoluted, and its performance is hindered, by the operation of various unrequired and arguably ill-conceived mechanisms such as "access checking", "bytecode verification", "protection domains", and even some optional "security manager". (The security manager is finally being deprecated as of Java 17.)
  • No compiler-enforced method purity.
    • It is not possible to declare a method as pure and have the compiler enforce that it, and any overrides of it, are in fact pure.
  • No compiler-enforced immutability.
    • It is not possible to declare a class as immutable and have the compiler enforce that it, and any derived classes, are in fact immutable.
  • Still no string interpolation in 2021.
  • Inconsistent rules for curly braces.
    • In most cases, curly braces are unnecessary unless the scope they enclose consists of more than one statement.
    • However, the curly braces are mandatory in some arbitrary cases, e.g. for method bodies and for try-catch-finally statements.
  • Lame style conventions, for example:
    • Underscores are inadvisable, which is absurd.
    • Methods, fields, variables, and parameters are to be named in camelCase, which is retarded.
    • Package names are to be named in all lowercase, which is retarded.
    • The curly brace style is to be egyptian, which is retarded.
  • No means of programmatically breaking into the debugger as per the `System.Diagnostics.Debugger.Break()` method of C#.
  • Class<T> is a misnomer. It is actually a type, because it may stand for either a class or an interface. C# does better here, too.
Note: the above list of disadvantages is kind of long, because I am intimately familiar with the language.

Feedback is more than welcome: you'd be doing me a favor. However, be aware that blogger sometimes eats comments, so be sure to save your text before submitting it. If blogger eats your comment, please e-mail it to me.
