
There Can be More than One

We urge programmers: "Don't Repeat Yourself". We seek to express every idea "Once and Only Once". This kind of thinking translates into architecture in the form of the "Enterprise" this, that, and the other. The Enterprise Data Model, say, or the Enterprise Message Bus. These carry a connotation of being a single solution used for all systems within an organization. Which has its benefits. And its costs. And, like all solutions, it creates its own problems.

If all the facets of an organization's IT world had the same usage patterns, the same latency bounds, the same real-time constraints and all the other characteristics that architecture addresses, then the cost of using the Enterprise solution (there always is a cost) would be easy to justify. But they don't.

It's certainly more convenient for us as technologists to build systems if there is only one: operating system, platform, programming language, message format, message bus, data model, database, component container, component framework, GUI toolkit, web toolkit, web browser, etc., etc., etc. That world is clean, efficient, and wholly imaginary. Striving to get into it often does more harm than good to the paying customer.

Axiom: by the time a business is large enough for someone to start thinking about the need for an Enterprise solution, it will be composed of the integrated systems of several businesses. There will be at least one other business acquired in the past, or the same business as it was in the past. Either way, there will be some duplication somewhere. We abhor duplication. But why?

We've learned to shun duplication of representation, of data, of logic, because it leads to well-known classes of failure. For example, we normalize out duplication in relational database schemas to make certain classes of update anomaly impossible. But then we denormalize read-only views for performance. And the limit of that is the data warehouse: a giant pile of denormalized, duplicated data, inconsistent with its source data stores. And no trouble at all, because it never receives ad-hoc updates. This duplication and inconsistency is safe because the rates of change of the data, and the distribution of changes over time, are so very different.
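To make that trade-off concrete, here is a minimal sketch in Python using the standard sqlite3 module; the customer/order schema is invented for illustration, not taken from the essay. The normalized tables keep each fact in one place, and the denormalized, read-only copy duplicates it for cheap reads. The copy can drift from its sources, and that is harmless so long as it is only ever rebuilt wholesale, never updated ad hoc.

    import sqlite3

    conn = sqlite3.connect(":memory:")

    # Normalized: each fact lives in exactly one place, so an update
    # anomaly (two disagreeing copies of a customer's name) is impossible.
    conn.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customer(id),
            total REAL
        );
        INSERT INTO customer VALUES (1, 'Acme Ltd');
        INSERT INTO orders VALUES (10, 1, 99.50);
    """)

    # Denormalized, read-only copy: the customer name is duplicated into
    # every row, like a tiny data warehouse. Safe only because it is
    # rebuilt from the source tables and never receives ad-hoc updates.
    conn.executescript("""
        DROP TABLE IF EXISTS order_report;
        CREATE TABLE order_report AS
            SELECT o.id, c.name, o.total
            FROM orders o JOIN customer c ON o.customer_id = c.id;
    """)
    print(conn.execute("SELECT * FROM order_report").fetchall())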

Let's say that our team maintains lots of little websites for lots of different customers. Let's even say that they are all implemented with the same stack. One day a new requirement comes along that is prohibitively expensive to meet with that stack. But the customer is a valuable one. What to do? Could we bear to implement just this one site using a new tool with faster turnaround? Maybe we can. But, says our software engineering conscience, what if... what if we have to change one of the old ones and we forget to update the new one in the same way? Or vice versa? What if, what if? Well, what does the data tell us? How often do those old sites change? Oh, between once every two years and never. Well then! And since they all have comprehensive automated functional tests (right? right?), we'll know if we broke them, so off we go.
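That safety net need not be elaborate. Here is a hedged sketch, in Python, of the kind of automated functional check that quiets the "what if we forget?" worry; the site URLs are hypothetical placeholders, and a real suite would assert on page content as well as status codes.

    import urllib.request

    # One entry per customer site, old stack and new alike.
    # These URLs are placeholders, not real addresses.
    SITES = [
        "https://customer-one.example",
        "https://customer-two.example",
        "https://customer-with-new-stack.example",
    ]

    def check_site(url, expected_status=200):
        """Fail loudly if a site stops answering as expected."""
        with urllib.request.urlopen(url, timeout=10) as response:
            assert response.status == expected_status, (
                f"{url} returned {response.status}"
            )

    for site in SITES:
        check_site(site)
    print(f"All {len(SITES)} sites answered as expected.")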

The default stance should be to avoid duplication and inconsistency; we know the pain too well. But never forget to stop and think, and recall that down at the bottom of the bag of tricks is multiplicity of solutions. Dismiss it 99 times out of 100, but dismiss it actively, not by reflex. And that one time it will save your project.

By Keith Braithwaite

This work is licensed under a Creative Commons Attribution 3.0 license
