Microservices with Layers
by Jeffrey Vroom
StrataCode offers a new strategy for managing your code more efficiently. It reduces the upfront design load on the architect and improves options for incremental updates to scale systems management.
Microservices has become a hot design focus for architects, particularly in recent years. This is no doubt in reaction to the time we've felt has been wasted in managing monolithic code bases. But we'll only really save time, if we wisely choose the boundaries between pieces. Due to additional complexity of implementing and managing remote interfaces between processes throughout the software lifecycle, it also seems important that we consider patterns that support "design evolution" - where we can start monolithic and gracefully break out parts of our system at the code level, or RPC level as the system evolves.
Programmers use the term "Microservices" to refer to the pattern of software development which emphasises deploying smaller services which communicate with each other using web service APIs (RPC or messaging based), instead of deploying larger services which use internal APIs (i.e. direct procedure calls which require coupling of the code of both interfaces).
There are lots of advantages to smaller services: smaller teams, separate source control repositories, more agile, independently scalable but some pitfalls when you apply this approach: more error checking, runtime overhead, coordination and availability requirements across services, and coordination across teams when interfaces change.
The architect of a large system balances many factors to make the right decision and is faced with high costs when a change is required.
Are the databases of the services easy to decouple, or would two fields benefit from being updated together with transactional integrity? What redundant data will need to be created and how often will it need to be updated? How will we manage integrity or mediate conflicts that might arise?
How easy will it be to test the API in an automated way so we can reliably support it without running all of the code that uses it? Will the services API need to change in an incompatible way? Are two candidate services destined to upgrade in lock-step more often than not due to feature overlap? Do we really need a remote service boundary between these services, or is a normal code-module boundary we can use with separate source-trees to provide a more efficient application with the same team separation? Will the API boundary be used too often for efficient RPC? Will there be too much data copying? How will failures in one service be handled in the other?
In support of Microservices, consider how much benefit of scale will be achieved by having independent development streams of the two services - i.e. separated in such a manner that they could be operated at arm's length from each other. For some applications, even though there's more work to build them as separate pieces, it's an important design boundary that should be enforced for maximum agility. Like separating a payment service which has different integrations and SLAs.
But if you split tightly coupled logic across services, you will spend more time managing them as separate services than as one. Splitting a database into two pieces without a good reason could just add more overhead for each insert, update and more potential data integrity problems.
These are not easy decisions, leading to a design challenge to pick the right pieces and design the interfaces carefully. It's more difficult when you are starting a new project and can't answer these questions with actual domain model experience. In that case, it would be fastest to just write code so we get it working quickly, knowing we have an easy way to repackage later based on experience. We could start out with services based on different classes and then easily run them as separate services with a seamless transition. We could also split the git repository into pieces based on how that module was evolving or who needed to manage it. We could do it all as a change to an internal project, splitting one published layer into a few layer parts, all without breaking any upstream dependencies or changing API contracts. When you learn to think of code in terms of layers, sharing types across prosses boundaries, and augmenting RPC with 'synchronization', you get these benefits in how you build and evolve systems.
Evolutionary code management
I'm the kind of programmer, that when starting on a new project, wants to deliver 'working code' for the core business-domain model as quickly as possible. Let the stakeholders assess and improve it when it's simple and still in the rapid development stage.
I'm also the kind of programmer that never wants to interrupt the incremental flow of business driven changes for an 'architectural rewrite'.
So let me show you how StrataCode has helped me reconcile these two demands in myself.
The prototype starts out in a single git repo, running in a single process, maybe using the file system for save/restore just to get things going. Once we have solidified the domain model somewhat, we'll build out a database schema and storage strategy and start the automated tests. If there's some obvious service that should be decoupled, we'll move that code into a new set of layers that we can optionally configure to run in a separate process.
When StrataCode builds these two processes now, it detects all of the methods which cross process boundaries and injects a "remoteMethod" call, or it may detect new errors for calls to remote methods with objects that are not serializable or the runtime does not support synchronous remote methods. Once we've cleaned those errors up, the code should just work without changes. And we have a quick way to inspect the service API and adjust the code to clean up the protocol. We'll run the unit tests both in local and remote mode, so we can detect when there's a problem caused by the remote api.
Eventually we add more developers and create a separate team for the remote service. Now it's just a matter of moving those layers into the new git repo and including that in the list of repos required for the application. When programmers do a new sync of the application, they are in sync automatically.
Down the road, when major feature changes are required - perhaps an incompatible domain model change is required, you can use layers to isolate the deprecated types, properties, and methods. Again, no contracts change but now you can separately build the compatible and incomaptible versions. Clients can opt-in to the new version. Static typing lets you discover any soon to be invalid references to the old version quickly. Code processing could even be used to find and alter references to code that refers to incompatible features automatically in some cases. Once the migration is complete, you can dispose of the deprecated layer and any tests that refer to it.
More freedom with layers
Layers give you this freedom, but to use them effectively requires thinking in a new way. Fortunately, you can start with what you know - use one layer for each module. Only add more layers when dependencies get more complicated or when a usage pattern changes that requires refactoring. In the early stages of development, it's fastest to build things in as few pieces as possible - e.g. one file per class, or even everything in one file. As it evolves split pieces out to avoid any one piece from getting too large. Keep a piece of logic together if it implements a logical unit - things that belong together because they are created or updated at the same time, and/or because the code is likely to be changed at the same time. Properties stay near the methods which manipulate them for example. Parts of the code which have the same dependencies will also be near each other in the file in this approach when different parts of the same type have different structural dependencies (e.g. UI and database).
When one file is getting too large, split it into pieces that helps improve efficiency, readability, reusability etc. Split off a layer to reduce conflicts between developers in source control, or to evolve a specific part of a class using a different git repo with different security policy (e.g. license keys, etc.)
When a microservice is needed
When a remote service boundary in the code is needed, review the APIs and make sure they are easily remoteable. This means the argument types can be serialized or are already synchronized. After that's done, reconfigure the layers so they are split into different processes. Now your application stack of layers is separated into two overlapping runtime layer stacks. Since StrataCode has both stacks at compile time, it can detect the remote method calls - those that cross the process boundary by calling a method that only lives in the other stack (or is marked via an annotation to be run in a specific process). It modifies the code appropriate based on the frameworks used. For Java to Java, this turns into a method call like: DynUtil.invokeRemoteMethod(...). For JS, your remote method calls must be in a data binding expression, so they can be run asynchronously. As part of the synchronization protocol, any arguments referenced in the remote call are lazily serialized before the call itself. If the object was received from the remote side or already sent, it's sent by reference using it's id. For basic uses cases, domain model code can be independent of the remote procedure boundary. For more complex use cases (e.g. flexible paging of list oriented data structures), we can improve the declarative sync model, or you can fall back to using RPC when a custom pattern is actually needed..
For example, many applications have a "User" which stores data about a user in the system. It would be logical to split the database that stores the user's profile data (i.e. address, name, etc.) from the behavioral history data (e.g. the last time they logged in, the last pages they visited, etc.). Some of the code in the User object is used in both places - at least the user id and login name properties. This one User class file can split into three layers: user.core (the shared parts), user.profile (the database storage for the id, name, address, etc), and user.history (the database storage for the id, login name, and history data). When the system combines all three in one process, it produces the same code as the original monolith. But when core and profile are combined, it forms a separate DB storage for just that information and when core and history are combined, it stores the other parts.
If the two need to communicate using APIs, we can detect that it's a remote method call (i.e. it's calling a method that's part of the same bundle of layers, but not in the same process) and hide that notion from the program if the client can use synchronous remote calls. If not, they can put the remote method call in a data binding expression, or code it up to process the results in a callback. Either way, we can still run that code using a local method call so we can still provide an incremental way to run both configurations.
To deploy microservices, it's common to have a central dispatcher service which delegates the client request to the appropriate service. StrataCode manages the entire system at build time and so can join together the entire process connectivity diagram at build time, to generate the new dispatcher config.
To break up a large code base deployed as a monolithic service, the first step is to refactor and that can be a large task that takes a long time to complete. It's typical that the original code base continues to need new features and maintenance so the refactored code base is chasing a moving target. It might take a lot longer than it should due to the churn this situation causes.
With StrataCode's robust, type-safe splitting and merging the initial refactoring does not have to change APIs or the end result. It can be deployed as the monolith right away as it's possible to ensure the generated result matches the original. This refactored code base then is available for new features while the same layers are reassembled into separate services as well. Both versions can run side-by-side until the new version is demonstrably better in every way. The old monolithic deployment model can stick around as a short-cut for developers since it's built from the same code.