2020-05-30

On Validation vs. Error Checking

Let me start with a couple of pedantic definitions; stay with me, the beef follows right afterwards.

Conventional wisdom says that validation is different from error checking.

Validation is performed at the boundaries of a system, to check the validity of incoming data, which is at all times presumed to be potentially invalid. When invalid data is detected, validation is supposed to reject it by returning an appropriate result, not by throwing an exception. Validation is supposed to be always on: you cannot switch it off on release builds and only have it enabled on debug builds.

Error checking, on the other hand, is performed inside a system, checking against conditions that should never occur, to keep making sure that everything is working as intended. In the event that an error is encountered, the intent is to signal a catastrophic failure (throw an exception) instead of causing some result to be returned. Essentially, the term Error Checking is shorthand for Internal Error Checking.  It can be implemented using assertions, thus being active on the debug build only, and having a net cost of zero on the release build.
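To make the two definitions concrete, here is a minimal sketch in Python. All names in it (ValidationResult, validate_age, years_until_retirement) are hypothetical and exist only for illustration. Python's assert statement plays the role of the debug-only assertion, since it is stripped when the interpreter runs with the -O flag.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ValidationResult:
    """Hypothetical result type returned by validation; never raised."""
    ok: bool
    message: Optional[str] = None


def validate_age(text: str) -> ValidationResult:
    """Validation: the input is presumed potentially invalid, so we report."""
    try:
        age = int(text)
    except ValueError:
        return ValidationResult(False, "age must be a whole number")
    if not 0 <= age <= 150:
        return ValidationResult(False, "age is out of range")
    return ValidationResult(True)


def years_until_retirement(age: int, retirement_age: int = 65) -> int:
    """Internal error checking: a bad age here is a bug, not bad input."""
    # Stripped under `python -O`, so it costs nothing on a "release" run.
    assert 0 <= age <= 150, f"internal error: age out of range: {age}"
    return max(0, retirement_age - age)
```

The first function always runs and always answers with a result; the second one assumes its argument is already good and fails loudly if that assumption is broken.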

So far so good, right?

(Useful pre-reading: About these papers)

Well, the problem with this conventional view of validation vs error checking is that it heavily relies on the notion of "system boundaries", which is not a well-defined notion.  Unless you are an application programmer, whatever you are building will in all likelihood be a subsystem of a larger system, and that system will in turn be a subsystem of an even larger system, and so on. Therefore, what you think of as the boundaries of your system will never be the actual boundaries of the actual system. You cannot have any claim to knowledge of the boundaries of any system that might incorporate your little creation as a component of it.

As a reaction to this uncertainty, most programmers maintain a self-centered view of the component they are developing as the system, and a short-sighted view of the boundaries of their component as the system boundaries. So, on those boundaries they keep doing validation.

Here is what's wrong with that:
  1. Your subsystem will be embedded in a larger system, which will be doing its best to always supply your subsystem with valid data. This larger system will be tested by its creators, and will be known to work correctly. This means that it will always supply your subsystem with valid data, so your validation will be useless, and it will just be wasting time on the release build.
  2. The validation results returned by your boundary methods will have to somehow be dealt with by the containing system, since ignoring results is considered a terrible practice, even when nothing is expected to go wrong. So, you are forcing the caller to litter their code with checks for your validation results.
  3. However, since the caller does not expect anything to go wrong, they will not be able to do anything other than throw an exception in the event that you return a validation failure result. So, not only are you forcing the caller to litter their code with checks for validation results, but these checks in turn will never be triggered, and the exceptions will never be thrown. Think about it: this is code that will never be covered by any coverage run.
  4. Even in the extremely unlikely event that the containing system does in fact supply your component with invalid input, triggering the scenario where your component returns a validation failure result and the containing system throws, this is virtually indistinguishable from the scenario where you simply throw in the first place, as part of your error checking, not validation. So, there is no need for you to return some validation result, no need for the caller to check it, no need for the caller to throw.
  5. To put it in simple words, a subsystem's validation failure is a supersystem's internal error.

So, what the above means is that the entire industry is doing it wrong. Nothing but the outermost layer of a system should be performing validation, and that's usually some application-specific integration layer. Subsystems should, at most, and as a convenience, offer free-standing validation facilities which may be utilized by enclosing layers as part of their own validation.
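
The points in the list above can be sketched in a few lines of Python. The function names are made up for the illustration, and the percentage conversion is just a stand-in for whatever your component does. The first pair shows a component that validates at its boundary plus the check its caller is forced to write; the last function shows the same component doing plain internal error checking instead.

```python
from typing import Tuple


def to_fraction_validating(percent: float) -> Tuple[bool, float]:
    """Component that validates at its own boundary and returns a result."""
    if not 0 <= percent <= 100:
        return False, 0.0
    return True, percent / 100


def caller_of_validating(percent: float) -> float:
    """Caller that is forced to check a result it never expects to fail."""
    ok, fraction = to_fraction_validating(percent)
    if not ok:
        # The only sensible reaction: treat it as an internal error.
        # In a correctly working system this branch is never covered.
        raise AssertionError(f"internal error: bad percentage {percent}")
    return fraction


def to_fraction_checking(percent: float) -> float:
    """Same component, doing internal error checking instead."""
    assert 0 <= percent <= 100, f"internal error: bad percentage {percent}"
    return percent / 100
```

From the outside, caller_of_validating(150) and to_fraction_checking(150) behave the same way: both end in an exception. The second form simply gets there without the result value, without the check, and without the dead branch in the caller.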

So, for example, the enclosing system might, in the context of its own validation strategy, ensure that every field in a form has been filled in, and then it might invoke the date-time subsystem's validation mechanism to verify that the value entered in some date-time field is valid, before feeding that value to that same subsystem, or storing it for feeding it to that subsystem later.
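
Here is a rough sketch of that arrangement in Python. The date-time "subsystem" is reduced to two hypothetical functions built on the standard datetime module, and the date format and form field names are assumptions made up for the example.

```python
from datetime import datetime
from typing import Dict, List

DATE_FORMAT = "%Y-%m-%d %H:%M"  # assumed format, for illustration only


# --- date-time subsystem (hypothetical) ---

def is_valid_date_time(text: str) -> bool:
    """Free-standing validation facility offered to enclosing layers."""
    try:
        datetime.strptime(text, DATE_FORMAT)
        return True
    except ValueError:
        return False


def parse_date_time(text: str) -> datetime:
    """Working method: invalid input here is the caller's internal error."""
    assert is_valid_date_time(text), f"internal error: bad date-time {text!r}"
    return datetime.strptime(text, DATE_FORMAT)


# --- enclosing layer that owns the form (hypothetical) ---

def validate_form(form: Dict[str, str]) -> List[str]:
    """Validation happens here, because only this layer knows about users."""
    errors = [f"{name}: required" for name, value in form.items()
              if not value.strip()]
    when = form.get("appointment", "")
    if when.strip() and not is_valid_date_time(when):
        errors.append("appointment: not a valid date and time")
    return errors
```

The enclosing layer calls validate_form first and talks to the user about any errors; only if the list comes back empty does it hand the value to parse_date_time, which at that point has every right to treat a bad value as a bug.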

That's because only the component which is dealing with the form knows that the information that it receives is coming from the user and therefore needs validation. Once this information has passed validation and been accepted into the system, it should never be re-validated. Any inconsistency after that point is an internal error of the system, and therefore a hard error.
