michael.gr: On Stateless Microservices

This post discusses the stateless microservice design pattern; it is meant as support material for other posts of mine that discuss microservices, mainly michael.gr - The Stateful Microservice.

Is statelessness a requirement for a microservice?

In another post (see michael.gr - So, what is a Microservice, anyway?) I examine what a microservice really is, and I come to the conclusion that from a purely technical standpoint, a working definition could be as simple as this:

A microservice is a scalable and resilient module.

Even if you disagree with the terseness of this definition, and you regard microservices as necessarily more than that, I hope you will at least agree that it is precisely scalability and resilience that statelessness in microservices aims to address, so this definition serves its purpose at least in the context of this series of posts.

There are many who will try to convince you that in order to build a scalable and resilient system, you need statelessness; so much so, that microservices have almost come to be regarded as synonymous with statelessness. This post examines whether this is in fact so, and what the cost is of doing things this way.

(Useful pre-reading: About these papers)

If we take a step back for a moment and examine the issue from a somewhat distanced point of view, we notice that there is no such thing as a stateless software system. If there were such a thing, it would not be capable of performing any function worth speaking of, and it would necessarily be less useful than a brick, because a brick has physical existence, so you can, at the very least, throw it at someone.

If there is one thing that a software system necessarily has, it is state, so there is no word that is more unsuitable to go with the word "software" than the word "stateless". (By the way, that is also a little something that functional programming aficionados should perhaps take a moment to ponder.)

What this all means is that even if you build a system using so-called stateless microservices, that system will still have state; for example, if it is a web shop, it will very conveniently remember me when I come to visit again, and if I order any goods during my visit, it will very inconveniently not forget to send me an invoice. That is all happening due to state, which is stored in the database system of the web shop. So, when people speak of microservices with no state what they actually mean is microservices with no transient state. The state is definitely there, the system just does not rely on any microservice remembering any of it. Each microservice refrains from keeping any state in memory for any longer than it absolutely has to; it always begins the processing of every single transaction by querying the database for all necessary state, and it makes sure to persist any changed state into the database before proclaiming the transaction complete.
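The load, process, persist cycle described above can be illustrated with a minimal sketch. Everything in it is made up for the sake of illustration: FakeDatabase is a hypothetical in-memory stand-in for the web shop's database, and add_to_cart is a hypothetical transaction; a real system would of course talk to an actual database over the network.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDatabase:
    """Hypothetical in-memory stand-in for the web shop's database."""
    rows: dict = field(default_factory=dict)

    def load(self, key):
        return self.rows.get(key)

    def save(self, key, value):
        self.rows[key] = value


def add_to_cart(db, customer_id, item):
    """Process one transaction without remembering anything afterwards."""
    # 1. Begin by querying the database for all necessary state.
    cart = db.load(("cart", customer_id)) or []
    # 2. Do the actual work on a local copy.
    cart = cart + [item]
    # 3. Persist the changed state before proclaiming the transaction complete.
    db.save(("cart", customer_id), cart)
    # 4. Nothing survives in the service: the locals vanish on return.
    return len(cart)


db = FakeDatabase()
first = add_to_cart(db, "alice", "book")
second = add_to_cart(db, "alice", "pen")
```

The second call knows about the book only because it asked the database; the function itself retained nothing between the two calls.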

Stateless microservices were invented because statelessness is an easy way of achieving scalability and resilience: if a module does not keep any state, then an indefinite number of copies of that module can be created to process requests in parallel; any request arriving at the server farm can be serviced by any instance of that module, and any subset of copies of the module can be destroyed at any moment, without depriving the system of its ability to function.
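Here is a rough sketch of that interchangeability, under simplifying assumptions: StatelessInstance is a hypothetical class, a plain dictionary plays the role of the shared data layer, and a random pick plays the role of the load balancer.

```python
import random


class StatelessInstance:
    """Hypothetical copy of a module; it holds only a handle to the shared store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store  # shared data layer, not data owned by this copy

    def handle(self, customer_id):
        # Read state, change it, write it back; keep nothing locally.
        visits = self.store.get(customer_id, 0) + 1
        self.store[customer_id] = visits
        return visits


shared_store = {}
instances = [StatelessInstance(f"copy-{i}", shared_store) for i in range(5)]
rng = random.Random(0)

# Any request can be serviced by any instance.
results = [rng.choice(instances).handle("alice") for _ in range(3)]

# Destroy an arbitrary subset of the copies; the system keeps functioning.
del instances[1:]
results.append(instances[0].handle("alice"))
```

Whichever copies happen to be picked, the visit count keeps increasing by one per request, because the count never lived in any of the copies.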

That's great, but statelessness is not an end in and of itself; it is a means to an end; it is just one way of achieving scalability and resilience. This is proven by the fact that the database systems upon which stateless microservice architectures are built are most certainly not stateless at all, and yet they do somehow manage to be scalable and resilient. Obviously, they are doing something differently.

What is wrong with statelessness?

When building a system which needs to be scalable and resilient, and also needs to be very stateful as a whole, one has to begin with a scalable and resilient data layer as a foundation. Luckily, there exist various commercially available products that accomplish this. On top of that foundation, one has to build their application-specific logic in a way that is also scalable and resilient. Stateless microservices will achieve this, but they are one of the worst performing, and from an engineering standpoint most cowardly ways of achieving scalability and resilience. Choosing the stateless microservices approach is like saying the following:

State is hard; we do not have the slightest clue as to how we can maintain state and at the same time remain scalable and resilient; but look, the creators of our database system are very smart folks, they seem to have figured it all out! So, here is what we will do: we will delegate the entire task of maintaining state to them!

That is how we arrived at the stateless microservice model, which I like to call the "Dory" model, after the fish that suffered from amnesia in the Finding Nemo movie.

In the Dory model, every single incoming transaction gets processed by a microservice that is drawing a complete blank. Upon receiving the request, the microservice starts with very basic questions:

Who am I, and what is this strange place I am running in?
Who are these folks sending me requests, and why?
Should I respond to them, or should I four-oh-three them away?
Let's start by authenticating them...

...and it goes on like that. For every single request, there are multiple round-trips to the database while the microservice is discovering more and more about what it is being requested to do and whether it should in fact do it, and even more round-trips to retrieve the information that will go into the response, including very basic information that hardly ever changes, such as the name of the visitor on whose behalf the request was sent, and in multi-tenancy scenarios even the name of the tenant on whose behalf the website is being served.

When the transaction is nearing completion, the stateless microservice will meticulously store every single little piece of changed state in its exact right place in the database, as if it is making notes to itself, lest it forget.

Finally, once the transaction is completed, the microservice will proceed to deliberately forget absolutely everything that it learned during the processing of the transaction, before it starts to wait for the next transaction.
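To make the cost of this amnesia tangible, here is a small sketch that counts database round-trips. It is only an illustration under stated assumptions: CountingDatabase, the keys, and the particular sequence of lookups in handle_request are all hypothetical; real systems will differ in which questions they ask and how many.

```python
class CountingDatabase:
    """Hypothetical database stub that counts every round-trip."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.round_trips = 0

    def get(self, key):
        self.round_trips += 1
        return self.rows[key]

    def put(self, key, value):
        self.round_trips += 1
        self.rows[key] = value


def handle_request(db, token, tenant_id):
    # "What is this strange place I am running in?"
    tenant_name = db.get(("tenant", tenant_id))
    # "Who are these folks sending me requests?"
    user_id = db.get(("session", token))
    # "Should I respond to them, or four-oh-three them away?"
    if not db.get(("allowed", user_id)):
        return 403, "Forbidden"
    # Basic information that hardly ever changes, fetched yet again.
    user_name = db.get(("user_name", user_id))
    visits = db.get(("visits", user_id)) + 1
    # Make a note to self before forgetting everything.
    db.put(("visits", user_id), visits)
    return 200, f"{tenant_name}: welcome back, {user_name} (visit {visits})"


db = CountingDatabase({
    ("tenant", "t1"): "Example Shop",
    ("session", "tok"): "u1",
    ("allowed", "u1"): True,
    ("user_name", "u1"): "Alice",
    ("visits", "u1"): 0,
})
responses = [handle_request(db, "tok", "t1") for _ in range(3)]
```

In this sketch each request costs six round-trips, four of which fetch answers that did not change since the previous request, so three requests from the same visitor cost eighteen.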

I am not going to say that this is preposterously inefficient, but it is preposterously inefficient.

Incidentally, the magnificent inefficiency of stateless microservices makes them to a certain extent a self-serving paradigm: in order to scale up you might think you need them, but once you have them, they will perform so badly, that boy oh boy, are you going to need to scale up!

Another problem with stateless microservices is that they cannot take any initiative of their own, they are restricted to only responding to incoming requests. This poses a problem with server-initiated client updates, which in certain circles are known as "push notifications". A server-initiated client update happens when the server decides to send some data to the client at an arbitrary moment in time, as a result of some event occurring on the server, without the client first having to request that data.

Actually, the very term "push notification" seems to have originated from system designs in which such sending of data is a difficult task, as if the developers have to put their shoulders against the notification and push it all together to make it straddle the great divide between the server and the client. In other designs, where asynchronous bi-directional communication is the default mode of operation, there is no need for such laboriousness; server-initiated client updates are just part of the normal way things work. Alas, you cannot have that with stateless microservices, because bi-directional communication requires the notion of a session, which in turn implies a notion of state, which is a no-no.
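The dependency of server-initiated updates on state can be seen in even the most minimal sketch. The PushHub class below is hypothetical, and plain callbacks stand in for real network connections; the point is only that the sessions dictionary is exactly the kind of transient state that a stateless microservice is not allowed to have.

```python
class PushHub:
    """Hypothetical server-side module that can initiate updates to clients."""

    def __init__(self):
        # This dictionary IS transient state: one entry per live session.
        self.sessions = {}

    def connect(self, client_id, deliver):
        self.sessions[client_id] = deliver

    def disconnect(self, client_id):
        self.sessions.pop(client_id, None)

    def publish(self, event):
        # The server takes the initiative; no client asked for this.
        for deliver in list(self.sessions.values()):
            deliver(event)
        return len(self.sessions)


inbox_a, inbox_b = [], []
hub = PushHub()
hub.connect("a", inbox_a.append)
hub.connect("b", inbox_b.append)
hub.publish("price changed")
hub.disconnect("b")
hub.publish("order shipped")
```

Take away the sessions dictionary and the hub has no idea whom to send anything to.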

Consequently, software systems that utilize the stateless microservice design pattern address the problem of server-initiated client updates in various wacky hacky ways:

Some opt to not have any; if the user wants to see what has changed, let them refresh the page. This can cause serious problems in systems where multiple clients may edit the same data, since the system has no way of alerting a client that the data they are editing is also being edited by another client at the same time.
Some use polling, meaning that each client keeps sending requests to the server at regular intervals asking whether there are any updates. This is wasteful, because each of these requests represents work that needs to be done, but very few of them will result in anything useful happening. At the same time, in order to reduce server load, the polling cannot be too frequent, which in turn means that there will always be a time lapse between the moment that an event occurs on the server and the moment that the clients take notice.
Some opt to have special stateful modules working side by side with the stateless microservices to handle the push notifications in a completely separate way, under the assumption that notifications are a kind of optional, "nice to have" thing anyway, so if performance suffers due to lack of scaling, or if service is interrupted due to lack of resilience, it will not hurt too much. On top of being clunky, this approach is also short-sighted because from the entire broad topic of server-initiated client updates it only considers the narrow case of updates being used for the sole purpose of on-screen notifications.
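The trade-off of the polling approach can be put into numbers with a toy simulation. The function simulate_polling and all the figures fed into it are made up for illustration; time is measured in arbitrary whole units, and each poll is assumed to pick up every event that occurred since the previous poll.

```python
def simulate_polling(event_times, poll_interval, duration):
    """Count total polls, useful polls, and per-event notification delays."""
    pending = sorted(event_times)
    total_polls = 0
    useful_polls = 0
    delays = []
    t = poll_interval
    while t <= duration:
        total_polls += 1
        # Events that occurred on the server since the previous poll.
        delivered = [e for e in pending if e <= t]
        if delivered:
            useful_polls += 1
            delays.extend(t - e for e in delivered)
            pending = [e for e in pending if e > t]
        t += poll_interval
    return total_polls, useful_polls, delays


# Two server-side events in 100 time units, one client polling every 10 units.
total, useful, delays = simulate_polling([12, 47], poll_interval=10, duration=100)
```

In this run the client sends ten requests, of which only two yield anything, and the events reach the client 8 and 3 time units late. Polling less often reduces the wasted requests but increases the delays, and vice versa.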

Further reading: michael.gr - The Stateful Microservice

Cover image: Dory, the yellow-blue fish (a Royal Blue Tang) that suffered from amnesia in the 2003 movie Finding Nemo by Pixar.

2021-10-14
