Articles

StrataCode: layered code for evolutionary modularization

by Jeffrey Vroom

Summary

Introducing a new language for type-safe layers of code where monoliths evolve gradually into microservices and everything is customizable without scaffolding.

Intro

The main idea behind this project formed almost thirty years ago. I had just spent two years building an object-oriented visual programming environment called AVS/Express. It was successful enough that customers used it to build large applications with many modules and connections. But despite my best intentions, it was difficult to reuse the parts of these large object graphs.

To customize even a small feature of an application required making a copy of the code or refactoring. I thought object-oriented inheritance would be enough, but it wasn't.¹ After a lot of thought, I realized layered inheritance was needed - the ability to modify a given type in a different slice, or layer. The layer modifies the original, instead of copying to create the customized version. This led to the configuration layers feature in ATG Dynamo, and experience with that led to this long-term effort to answer the question of whether a layer-oriented programming language can improve scalability, and customizability of software.

One can always work around the limitations in a platform, but I feel the great potential for StrataCode and layers to help developers architect more scalable and customizable platforms. To provide more context on why I think that way, I'll present two challenges programmers face building systems then show how coding with layers changes the game.

First challenge: monoliths and microservices

To help understand the trajectory of monolithic development, let me refer to the recent article from Shopify's architecture team on how they manage their 2.8M line monolithic service. There's no question the monolith has been successful for Shopify, but they are seeking more agility by creating more modular, reusable, separable components. They've made progress and are on a good path, but it is not easy because the changes required break compatibility across a large volume of code.

It's common for a monolith to be the best design in the early stages, but as the system grows, it scales better with more separation. Large systems benefit from more separable components or partitioning into separate services with well-defined API boundaries. This also permits more customized downstream deployments and improves operational scale and efficiency. But there's never a great time to stop everything and perform major refactoring to make the changes. If developers try to predict future modularization needs upfront, they can over-modularize. They might make the mistake of splitting even fine-grained, tightly-coupled features into separate microservices. Or even if the services are well designed, they spend design and development time only to create more operational complexity and runtime overhead that does not pay off until much later.

Here's a hypothetical diagram to illustrate this point. The monolith starts out with an advantage in the rate of feature development, but some time in the future might arrive at the same feature set as a microservices version but with a slower rate going forward:

Second challenge: designing for customization

Before describing how layers help with this tradeoff, let me describe another tendency that I've seen that contributes to friction during software development. It's difficult for programmers to accurately predict and design hooks for future customization by downstream developers, testing, and operations.¹ This includes choosing the properties that are configurable, the components that can be extended and designing the interfaces for plug-ins, and callbacks. How many of us have had to copy an XML, JSON, or YAML configuration file with so many values that never change? Or struggled to debug problems with an over-designed inversion of control component configuration? Or needed to implement a feature that could not be plugged in easily to a 3rd party component? It's all too easy for programmers to over-design or miss an important feature when designing for customization.

The solution: layers for evolutionary modularization

Layers help address both of these challenges. First, for the tension between monoliths and microservices, they allow developers to remodularize an existing system in place, without changing API contracts. Build both the monolith and microservice configurations from the same source code. They enable what I'll call evolutionary modularization, where the system keeps the agility of the monolith at first, and allows it to seamlessly evolve into the best modular structure for any scale.

Code can move from file to file, layer to layer, bundle to bundle, service to service, mostly through cut and paste and without changing APIs and disrupting the downstream code. Many developers already do this to some degree. Layers are the missing piece that support remodularizing large systems in-place.

With evolutionary modularization, the developer might prototype the design in one file, perhaps an HTML template that's mostly declarative but includes the domain model as inner types. At this stage, it's easy to change, share and collaborate. Give a copy to the product manager who can make changes quickly to refine the business domain model just like you would a spreadsheet. When the project outgrows one file, the inner classes are moved to separate files where they can be reused.

Annotations are added to customize persistence and expose APIs. When the directory becomes too large, move files into sub-directories. When one aspect of the code needs to be used in an independent context, it is moved into a new layer. Initially, the layer is like a module but with some extras like default annotations and imports.

With layers, all of these changes are transparent for the code in the downstream ecosystem since the published API contracts don't change at any step.

Splitting a monolithic service with layers

As the project grows, at some point the monolithic service might need to be split.

A Joshua Tree by Bernard Gagnon, (license), cropped from (Wikimedia Commons)

Let's say validation rules and methods also need to run in the browser for usability, or statistics gathering methods need their own process for performance reasons. At that point, those parts of the classes needed by both sides are moved into a new shared layer. The shared layers implicitly define the set of remote types. The set of server methods called from the shared layers help determine remote API. Validation and other code in the shared layers runs on both sides, in two different versions of the same class.

The split into two processes with layers is not entirely seamless. Some async remote method calls might need to be moved into a data binding expression. For security or performance, new annotations can be added. Some code on the new process boundary might need to change to make a reasonable remote API. But these changes can be incremental as both the client/server and monolithic versions are built from the same code. That makes for an easier switchover and preserves the development agility of the monolith even after the switch.

Layers for universal customizations

To solve the second problem, layers free developers from the burden of designing and coding customization points. Instead they choose to publish features as needed for downstream customization using layered code. This supports easy code that provides more flexibility down the road.

The type system and IDE track customizations like regular code, guiding both the original product developer and the downstream developer that needs to customize the product. There's one customizable name space for types and instances and no need for a separate inversion of control, configuration, or plug-in framework.

Layers - the missing organizational tool for code

There are a number of other benefits to layers as a project evolves. They also allow systems to adapt as frameworks and other dependencies change. Code with a particular dependency is separated into a new layer with cut and paste. Since type names don't change, existing contracts are preserved. They offer the ability to split features out into separate layers to improve marketability and packaging without redesign.

The challenge in applying layers effectively is understanding the basic design principles, and how to adjust the layers of code as the system evolves. In the merged view, a type exposes the same set of properties and methods. In the layer view, a type's supported set of properties and methods grows as each layer adds to it. This allows aspects of a type to be separated, along with the dependencies injected by that aspect.

Although there's a conceptual leap for the architect managing it all, it's not that different from what we do today for reusing code with plain object-oriented inheritance. I'm hoping that there are at least a few out there who will understand why this is a game-changer.

The big book of software patterns

For these reasons, I believe layers belong in software's version of the big book of essential patterns (borrowing the term Paul Erdős coined for math proofs). To me this book also contains these principles:

Code readability and debuggability are top considerations when coding
Code paths should be traceable at edit time (e.g. static typing, find usages) and runtime (e.g. clean stack traces, easy breakpoints, and logging).
Keep application code separable from framework code
Use annotations, components, properties and data binding expressions for a declarative skeleton, glued together by regular code.
Keep functional/declarative expressions readable by mortal programmers. Don't embed a rubix cube in the code for the sake of saving a few lines of code.

Using these principles and experience with frameworks at AVS, ATG, and Adobe, I have validated this approach with layer-aware frameworks for reactive, declarative apps with the goal of building more scalable platforms.

Project status

It has not been an easy road, but I feel lucky that I've had the time and energy to make a tool that feels flexible and useful with steadily improving quality and performance. I don't have the desire to release something with lots of bugs, but it's hard to do so without more developers. I like the idea of open sourcing the parts where there's sufficient interest.

Layered extensions to Java and the code-processor have now been tested in the IntelliJ plugin, the web framework, a Java to JS converter, data-sync, database integration, and more.

There's a dynamic runtime for fast round trips, build files and test scripts.

The site builder is an early version of a complete app development framework with blog and product plugins built in around 15K lines and 32 layers. Initially built as server only (sending HTML changes over the wire to a small fixed JS client) later split to also support client/server mode. See the [demo videos](/videos.html) for a code walk-through of these concepts using IntelliJ.

The program editor is an early version of a live-programming tool for viewing and editing layers, types, properties, binding-expressions and instances complete with swing and browser code editing.

See the todo list and the rest of the examples and documentation.

StrataCode is open source. See the status page for up-to-date information on the status of each major feature.

Read more articles about StrataCode from the menu, or Download or signup for updates.

^{1. To refine a type with inheritance requires creating a new name for the subtype. During refactoring, this means references to instances need to use one or the other type name. Then it's possible casts have to be added to the new subtype. Sometimes, this goes smoothly but in complex situations it's a mess and a poor way to implement customization. Do a quick look around your code base and see how much code touches the inversion of control framework, or is actually scaffolding to support customizations.
↩}
^{2. When customizing a code base directly, without downstream developers, the feature flag is an easy and obvious solution. Because they only work when customizations are built-in to the original code, they can be easy to maintain as long as there are not too many. To use them effectively, it's important to ensure all code-paths for a given feature are easily traceable (i.e. support a "find usages" tool that shows all affected code) and that too many feature flags don't overlap and pile up in the same function making a crazy maze of if statements↩}