The topic of software architecture has become a bit cringe. Some people will roll their eyes at the mere mention of it.

My impression is that this is because it has often been a very top down practice. Approval from an architecture committee must be secured before starting anything new. Nothing gets done unless it’s been vetted by “the architect”. The people who are closest to the problem being solved must seek permission from the people who are furthest removed from it, and convince them that their solution is best. There are many flaws in this approach to solving problems, and you’ll find many opinions online on how having architects is an anti-pattern. Not least among these flaws is that there is no such thing as a best solution: it’s trade-offs all the way down.

This architect archetype is probably best illustrated by the Architect in the Matrix movies: a man with a sense of superiority and a white beard.

Although architects get a bad rap, I do think architecture is an important aspect of software engineering. The better question is how to get to an architecture that enables different teams to operate more independently.

In this post, I discuss what software architecture even is and offer some thoughts on a bottom up approach.

Software “Architecture”?

It’s funny to think of software “architecture”. In many ways software architecture is antithetical to regular architecture. Software is malleable in ways that a building or a bridge is not. With a building, you hopefully won’t change too many things once construction starts, whereas software architecture is often intended to maximize our ability to change things as we progress.

First we should distinguish architecture itself and its purpose from the process of establishing one. It all starts with good intentions. As an organization grows, the complexity of software systems invariably increases to a point where nobody can have a clear picture of how everything works. We end up in a situation where every time you change something, you break something else. This often leads to calcification. People try not to touch things that are depended upon. In turn, this creates a vicious cycle that incentivizes adding more to the pile of increasing complexity and impedes any effort to simplify things. We are all limited by our ability to keep in our head the various components of the broader system and who works on them. Opinions on the ideal team size vary, but generally it is on the order of a single digit. These fundamental limitations of human nature drive the need to decompose into (hopefully) loosely coupled components and teams.

Decomposing systems into manageable units with tolerable complexity is the problem architecture is meant to solve. The whole point is to enable teams to move as independently as possible and iterate quickly while minimizing the need for coordination. Conway’s law states that we will inevitably ship an architecture that follows the lines of communication of our organization. Simply because it is easier to discuss within a single team than across several, we will naturally minimize changes that involve cross-team communication. This is often seen as a negative side effect of an organization’s structure, but it is less a curse than an intrinsic property of the system by construction. Architecture and organization go hand in hand because architecture emerges to enable the org to function as designed.
When we talk about an elegant, loosely coupled design, the goal is to enable teams to make loosely coupled decisions. One should be able to change how things work in different areas of the system without reworking everything or needing to talk to many people to figure out who is impacted by those changes.

Architecture does not have to be top down

To simplify, we can summarize the trade-off of the top down approach as follows: the main benefit is strict alignment on goals (assuming good communication), the main drawback is that decision making is far removed from those who understand details and nuances best. Teams executing on a plan will be held accountable for the results but are not empowered to make many of the decisions needed to best achieve them. This has negative side effects, like a lack of sense of ownership that causes bad ideas to be discovered late in the process. Centralized decision making also quickly becomes a bottleneck in a larger organization.

A bottom up approach where teams are empowered to make decisions does a better job of aligning accountability for reaching goals with the agency to decide how best to reach them.
The bottom up approach is better from this perspective, but it is also a trade-off. Decisions are made by the people who best understand the systems and who will also be responsible for the consequences, which creates a more virtuous cycle of incentives. However, there are drawbacks that result directly from decentralizing decision making. If you leave every team to its own devices to make decisions without coordination, the teams are unlikely to all naturally reach the same conclusion on what problems we’re solving or who is solving which part. There is going to be a level of chaos that needs to be managed.

Naturally, the various components of a wider system have dependencies, and decisions on each of them cannot be made in a vacuum. Changing something will often impact other teams whether we are aware of it or not. Giving agency to the people who are going to be accountable for executing on those decisions introduces the expectation that they consider the impact of those decisions on other teams. Conversely, they will need to be aware of the impact of others’ decisions on their own work.

To support this need for clarifying dependencies, there are two main types of documents that typically underpin an effort to define an architecture rather than just cobbling things together. The first is documentation of the current architecture. It can take various forms and focuses on explaining how components depend on each other, why, and who owns them. When we want to modify how these components interact with each other, we invariably have to talk to the people behind them. This current view needs to be maintained over time and benefits from being short and to the point. The second is design documents that focus on making decisions related to changing the system. For those, it is important to clearly define the problem being solved before discussing solutions… unless we like arguing in circles on solutions that solve different problems.

Since software architecture is very different from regular architecture, we don’t actually need a role that centralizes drawing exhaustive and precise plans to be followed closely. We do need people facilitating alignment amongst teams to manage and limit the increase of complexity caused by decentralized decision making. Whether you call these people software architects or some other senior engineering title doesn’t really matter.
An architecture review process should bear a lot of similarities to the Socratic method: asking questions about the problem being solved, identifying the characteristics of a good solution and deciding whether the proposed solution actually hits the mark. The goal is not to tell people what to do but to help them reach good conclusions independently and make sure they think things through.
Instead of becoming a bottleneck by verifying every solution, the most senior people can focus on making sure that the people closest to the problem, who are responsible for the decisions, have access to the right information. Building trust becomes a force multiplier that enables more decentralized and parallel decision making.
White beard optional.

Enabling bottom up architecture

To enable people to make good decisions, we need to give them the right context. This context will come from various roles with various responsibilities, whether it’s product managers, technical leadership or their peers on other teams. Since we, as humans, cannot read every design doc or absorb everybody’s feedback, we’ll need a mechanism to route the right information to the right people. The organization can help achieve this by providing clear ownership of components and identified representatives for the teams.

Requesting a review for a design document is intended to facilitate the collection of this context. By clearly describing the problem being solved and the solutions considered, you create the opportunity to reconcile the inconsistencies of distributed decision making. When creating such a design document, we need to identify a set of reviewers whose feedback is important because they either have information about this type of problem or are going to be impacted by the solution. The number of reviewers has to be limited because nobody has the bandwidth to review all designs or to digest everybody’s feedback. Just like the ideal team size, the number of reviewers stays in the single digits to remain manageable. It is also important to limit the time for review so that everybody can make progress. In other words, the latency induced by the reconciliation process must be kept under control.

The main difficulty and success factor of such a review process is figuring out who the relevant reviewers are. In a small company, everybody can be aware of everything that is happening at some level. As a company grows this is no longer the case and we need to understand what areas are relevant and identify representatives for those areas. Besides writing code, growing as an engineer means taking responsibility beyond the deliverables of your own team. You need to ensure you are not just optimizing something locally to the detriment of your organization’s common goal. To use a lean manufacturing example: if one team in the car factory figures out how to produce door handles much faster, it is wasteful and counterproductive to accumulate door handles while other parts of the production chain can’t keep up. You need to look at the system as a whole. Just optimizing its parts independently does not necessarily lead to good outcomes.

Systems are made of components and system diagrams document the dependencies between them, whether it’s as a library or a service call. These components are maintained by people who will need to be contacted when changes are required. Organizational structure and architecture go hand in hand because software engineering is a social activity. The social graph within an organization should reflect the dependencies between the teams. You need to know and trust your neighbors so that you can help each other.

Nurturing an engineering community where people build relationships with the maintainers of the components they depend on is extremely important to a bottom up culture. With time, the role of software engineer includes more and more glue work to ensure that we understand our dependencies and can collect feedback from the right people without creating a communication overload.

If we’re doing things right, everybody becomes comfortable giving and receiving constructive feedback. Reviewers of design docs feel responsible for helping the author be successful and avoid drive-by comments. Teams are aware of the impact of their actions across the organization and strive to help each other reach a more global optimum instead of staying isolated in their local bubble.

Since we cannot know everyone, we do need representatives for various areas of the system. The architecture documentation is a map for navigating team dependencies and finding who should be included in design reviews. It clearly identifies ownership of components, contracts and dependencies.
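To make the idea of architecture documentation as a map a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the component names, the team names, the `Component` record, the `teams_to_consult` helper and the cap of nine reviewers are all hypothetical. It only shows that once ownership and dependencies are written down, finding the teams to invite to a design review becomes a simple lookup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    owner: str                         # team that owns the component
    depends_on: tuple[str, ...] = ()   # components it calls or links against


# Hypothetical architecture map: names and owners are made up for illustration.
ARCHITECTURE = {
    c.name: c
    for c in (
        Component("billing-api", owner="payments", depends_on=("ledger", "auth")),
        Component("ledger", owner="payments"),
        Component("auth", owner="identity"),
        Component("checkout-ui", owner="storefront", depends_on=("billing-api", "auth")),
        Component("reporting", owner="analytics", depends_on=("ledger",)),
    )
}

# Assumption: keep the review group in the single digits.
MAX_REVIEWERS = 9


def teams_to_consult(changed: str, architecture: dict[str, Component]) -> list[str]:
    """Return the teams owning the direct neighbors of `changed`.

    Neighbors are the components `changed` depends on and the components
    that depend on it. The team owning `changed` itself is excluded.
    """
    target = architecture[changed]
    upstream = {architecture[name].owner for name in target.depends_on}
    downstream = {c.owner for c in architecture.values() if changed in c.depends_on}
    teams = sorted((upstream | downstream) - {target.owner})
    if len(teams) > MAX_REVIEWERS:
        # Too many neighbors to review with: pick representatives instead.
        raise ValueError(f"{len(teams)} teams impacted, choose representatives")
    return teams


print(teams_to_consult("billing-api", ARCHITECTURE))  # ['identity', 'storefront']
```

Raising an error when the list grows too long is only one possible choice. The point is that a change touching too many teams is a signal to pick representatives, or to revisit the dependencies themselves.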

The main product of architecture becomes a dependency graph between teams with a hierarchical structure that enables decision-making at different granularity levels somewhat along the lines of the organization hierarchy. Evolving the architecture aims at improving this dependency graph so that teams can operate independently as much as possible.
In order to achieve this, we all need to be aware of these dependencies and consciously make decisions about the architecture and our organization that will not hinder teams’ ability to make their own decisions and execute in the future.
Designing the organization and its software needs to be a collaborative exercise between several roles from managers to engineers as they are intimately intertwined. As an organization grows, its structure needs to evolve incrementally hand in hand with the evolution of the architecture.
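As a rough illustration of a dependency graph between teams, the following Python sketch derives one from component ownership. The data, the team names and the helpers `team_dependency_graph` and `fan_in` are all hypothetical, and counting inbound dependencies is only one possible way to look at coupling, not an established metric from this post.

```python
from collections import defaultdict

# Hypothetical ownership and dependency data, made up for illustration.
OWNERS = {
    "billing-api": "payments",
    "ledger": "payments",
    "auth": "identity",
    "checkout-ui": "storefront",
    "reporting": "analytics",
}

# Each pair reads "the first component depends on the second".
DEPENDENCIES = [
    ("billing-api", "ledger"),
    ("billing-api", "auth"),
    ("checkout-ui", "billing-api"),
    ("checkout-ui", "auth"),
    ("reporting", "ledger"),
]


def team_dependency_graph(owners, dependencies):
    """Project component dependencies onto the teams that own them."""
    graph = defaultdict(set)
    for source, target in dependencies:
        source_team, target_team = owners[source], owners[target]
        if source_team != target_team:  # dependencies inside one team need no coordination
            graph[source_team].add(target_team)
    return {team: sorted(deps) for team, deps in sorted(graph.items())}


def fan_in(team_graph):
    """Count how many other teams depend on each team."""
    counts = defaultdict(int)
    for deps in team_graph.values():
        for team in deps:
            counts[team] += 1
    return dict(counts)


graph = team_dependency_graph(OWNERS, DEPENDENCIES)
print(graph)
print(fan_in(graph))
```

In this made-up example, both `payments` and `identity` are depended upon by two other teams, so changes there need the most coordination. Evolving the architecture would mean reshaping the graph so that fewer decisions have to cross team boundaries.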

Thank you Tanya Reilly and Ross Turk for the feedback.