A Blog about Business, Enterprise Architecture and Software Design by the Founders of ZenModeler

Simple and Easy Software Design – QCon London 2012

The talks I remember the most from QCon's second day are those from Rich Hickey. I was lucky enough to attend both the keynote in the morning (slides available) and the talk about the modeling process in the evening at the London Java Community Group (slides also available at the GotoCon website).

I had already been struck by the clarity of Rich's thinking in the Clojure documentation, but the unveiling of the Datomic service a few days ago revealed a real visionary. He is a sort of philosopher who first digs very deep to grasp the essence of concepts like state, identity or time, and then applies his thoughts to technology. I'm fond of this way of thinking and doing.

I already wrote about state, identity and behavior a few weeks ago. Here I take a more general angle and write about simplicity, which was also the subject of Rich's keynote at QCon: « Simple Made Easy ». Simplicity is a subject that fascinates me, as I see it as one of the ultimate goals of my design activities. In the following sections I will humbly mix my own thoughts on the subject (sentences starting with « I ») with Rich's (sentences starting with « Rich »).

Simple and Easy

Rich started by talking about the core and boundaries of these two words:

  • Simple: one fold, which is not composed
    • From the Latin simplicem, "without a fold"
    • Is the opposite of complex
    • Is an objective concept, independent of the human observer
  • Easy: to lie near
    • Is a relative concept: what lies near depends on the person

Complication or complexity?

As a first step toward grasping the concept of « simplicity », I started by looking at what its opposite, complexity, is. So at first I separate complication from complexity:

  • Complication:
    • Deterministic, with a lot of details and connections, but with linear « causes and effects »
  • Complexity:
    • Non-linear « causes and effects » and emergent behavior, which make the system hard to predict

Finally, even though it's interesting from an intellectual point of view to separate complication and complexity, I think we can just use the following definition:

Complexity is a measurement of how difficult it is for someone to understand how a system will behave or to predict the consequences of changing it.

This definition is relative to the subject trying to understand the system, so I think that what is pertinent to analyze here is the subject's perception of complexity.

What makes a system difficult to understand?

High level of variety in its components

Low variety in the system vs High variety in the system's constituents

Lack of order

Same relations between the elements but some organization (partitioning) added on the left

Same relations between the elements but some organization (grouping) added

High degree of connectivity

Linux Apache system calls for serving an image

Windows IIS system calls for serving an image

Which one do you prefer to work with? :-)

But is simplicity only a lack of complication and complexity? It's a good start (high order, low connectivity and low variety), but I think there is something more to it: simplicity is also defined by what you add with clarity, purpose and intentionality (look at the DDD intention-revealing interface pattern). I think the most difficult part of simplicity lies in the latter.

Complexity and Simplicity toolkit

Rich's morning talk was a paean to Clojure and an argument for why Clojure does better on simplicity and ease. Rich revives an old English verb, « complect » (« to join by weaving »), and compares complexity and simplicity constructs by what they complect. It was actually a confrontation between the OOP/imperative way of thinking and the functional one, exposing the different concepts on each side:

Complexity toolkit

Construct | Complects
State | Everything that touches it
Objects | State, identity, value, operations
Methods | Function and state, namespaces
Syntax | Meaning (derives meaning), order
Inheritance | Types
Switch/matching | Multiple who/what pairs
Variables | Value and time
Imperative loops, fold | What and how
Actors | What and who
ORM | OMG
Conditionals | Why, and the rest of the program

Simplicity toolkit

Construct | Get it via
Values | final, persistent collections
Functions | a.k.a. stateless methods
Namespaces | Language support
Data | Maps, arrays, sets, XML, JSON, etc.
Polymorphism à la carte | Protocols, type classes
Managed refs | Clojure/Haskell refs
Set functions | Libraries
Queues | Libraries
Declarative data manipulation | SQL/LINQ/Datalog
Rules | Libraries, Prolog
Consistency | Transactions, values

It appears that the Clojure concepts get all the simplicity labels :-). Nevertheless it's a good kick in the ass for the OOP approach.
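As a small illustration of the first two rows (values, and functions as stateless methods), here is a Java sketch; the Money type and its method are my own, purely illustrative example of programming with immutable values, not something from the talk.

```java
import java.util.List;

// A value: all components are final, there are no setters,
// and equality is based on the data itself (records give us that for free).
record Money(String currency, long cents) {

    // A "stateless method": it depends only on its arguments
    // and returns a new value instead of mutating anything.
    Money add(Money other) {
        if (!currency.equals(other.currency)) {
            throw new IllegalArgumentException("Currency mismatch");
        }
        return new Money(currency, cents + other.cents);
    }
}

class ValuesDemo {
    public static void main(String[] args) {
        // List.of builds an unmodifiable collection: nobody can change it behind our back.
        List<Money> prices = List.of(new Money("EUR", 1000), new Money("EUR", 250));

        Money total = prices.stream().reduce(new Money("EUR", 0), Money::add);
        System.out.println(total); // Money[currency=EUR, cents=1250]
    }
}
```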

What brings simplicity in our programs?

  • Composition of simple components
    • This is really the fundamental way to get a simple design, and a direct application of the Single Responsibility Principle.
    • It is in tune with the Unix philosophy, which composes small programs (cat, grep, sed, awk) and links them together through a text interface (the pipe); see the sketch after this list. Actually, I think every point of the Unix philosophy helps bring simplicity into a design.
  • Modularity by partitioning or grouping
    • But be careful: grouping and partitioning are enabled by simplicity, not the other way around.
  • State always brings complexity
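To make the composition point concrete, here is a minimal Java sketch (the function names are mine, purely illustrative): three single-purpose functions, each doing one thing, chained together in the spirit of cat | grep | sed.

```java
import java.util.List;
import java.util.stream.Stream;

class PipelineDemo {
    // Each function has a single responsibility, like a small Unix tool.
    static Stream<String> lines(String text) {                              // "cat"
        return text.lines();
    }
    static Stream<String> grep(Stream<String> in, String needle) {          // "grep"
        return in.filter(line -> line.contains(needle));
    }
    static Stream<String> replace(Stream<String> in, String from, String to) { // "sed"
        return in.map(line -> line.replace(from, to));
    }

    public static void main(String[] args) {
        String text = "alpha error\nbeta ok\ngamma error";
        // The "pipe": the output of one simple step is the input of the next.
        List<String> result =
                replace(grep(lines(text), "error"), "error", "ERROR").toList();
        System.out.println(result); // [alpha ERROR, gamma ERROR]
    }
}
```

Each piece stays trivial to understand in isolation; the behaviour comes from the composition.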

Example of incidental complexity with the « order » concept

Then Rich talked about the « order » concept, a concrete example of one that brings incidental complexity and infects a lot of our usual programming constructs without being useful. Why does « order » bring incidental complexity? Because modifications are inhibited: a change in the order impacts every consumer of the collection. Hence we find the « order » concept in the constructs in the left column below, while the constructs in the right column fulfill the same needs without that over-added « order » concept.

Order in the Wild

Complex | Simple
Positional arguments | Named arguments or map
Syntax | Data
Product types | Associative records
Imperative programs | Declarative programs
Prolog | Datalog
Call chains | Queues
XML | JSON, Clojure literals
... | ...
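To illustrate the first row of that table in Java (which has no named arguments, so a builder plays that role; all names here are mine): with positional arguments the caller has to remember the order, while the builder attaches each value to a name and the incidental ordering disappears.

```java
// Positional: the caller must remember that the third argument is the city
// and the fourth the country; reordering the parameters breaks every call site.
record Address(String street, String zipCode, String city, String country) {}

// "Named arguments" via a builder: each value is attached to a name,
// so the order in which callers supply them no longer matters.
class AddressBuilder {
    private String street, zipCode, city, country;

    AddressBuilder street(String v)  { this.street = v;  return this; }
    AddressBuilder zipCode(String v) { this.zipCode = v; return this; }
    AddressBuilder city(String v)    { this.city = v;    return this; }
    AddressBuilder country(String v) { this.country = v; return this; }

    Address build() { return new Address(street, zipCode, city, country); }
}

class OrderDemo {
    public static void main(String[] args) {
        Address positional = new Address("10 Downing St", "SW1A 2AA", "London", "UK");
        Address named = new AddressBuilder()
                .country("UK").city("London")                 // any order works
                .street("10 Downing St").zipCode("SW1A 2AA")
                .build();
        System.out.println(positional.equals(named)); // true
    }
}
```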

Conclusion

Why take all this time to write about simplicity, in what can be seen as a very philosophical way? Because simplicity is a key concept and a key skill when designing our software, not only in its technological aspects but first and foremost in its domain.

« Yet the most significant complexity of many applications is not technical. It is in the domain itself, the activity or business of the user. When this domain complexity is not dealt with in the design, it won't matter that the infrastructural technology is well-conceived. A successful design must systematically deal with this central aspect of the software. » Eric Evans, author of Domain-Driven Design.

« Simplicity » and « Common sense over common practice » were also the corollary of the « tests considered harmful » and « ORM, OMG! » themes that were highlighted several times during the conference. What I like the most at QCon is the « think differently » attitude that I could feel in some talks given by opinionated people. After that exposure I started looking at things and thinking differently, so thank you Rich for having given me that!

Other write-ups about that second day at QCon:

Know The Trade-offs Of Your Design Decisions – QCon London 2012

Trade-off is the keyword I remember the most from that first QCon London 2012 day. It was a key point of Dan North's presentation yesterday: each decision implies a trade-off, whether you know it or not.


Trade-off means that there is no black-and-white choice; you always trade something for something else. The point is to make an informed decision, not one influenced by emotion or fashion.

Dan talked about four topics and the usual decisions that go along with them:

  • Team composition: co-located or distributed? feature or layer teams? experienced or inexperienced? small or big?
  • Development style: automated or manual build? test-first? TDD?
  • Architecture: monolith or components? resources or messages? synchronous or asynchronous? single event loop or multiple threads?
  • Deployment: automated or manual deployment? vertical or horizontal scaling? hosted or in-house? bespoke or commodity?

Common Sense is better than Common Practice

Dan North.

This is a quote from when he talked about times when tests are the worst thing to do given the trade-offs: longer time-to-market, a disposable application, low return on investment, lower agility for change, etc.

Design is all about the decisions you made

All your architecture is about the informed decisions you made. A decision can have a system-wide impact on the quality attributes of the system, or a localized one; these define respectively macro (cross-system) and micro (system-internal) architecture, as Stephan Tilkov stated in his presentation. Design is all about the macro- and micro-level decisions you made.

Stephan also talked about the objectives and rules that go with each level of architecture. Here are some topics that can have rules associated with them, each categorized as macro or micro:

Cross-System | System-Internal
Responsibilities | Programming languages
UI integration | Development tools
Communication protocols | Frameworks
Data formats | Process / workflow controls
Redundant data | Persistence
BI interface | Design patterns
... | ...

To take account of time, each set of rules is versioned, with the assumption that two systems on the same rules version can interoperate seamlessly.

At the beginning of a project the objectives are mostly ease of development, homogeneity, simplicity, etc.; as time goes by (and as the artefacts accumulate) the preponderant objectives become modularity, decoupling and autonomy. Nice point: Stephan also talked about the domain architecture, which is often the most stable one.

Finally, Stephan talked about integration, especially UI and data integration. I leave the UI part aside to focus on data, with these pieces of wisdom shared by Stephan, Greg Young and Dan:

Each time you share something between two components (code, data) you introduce a new coupling

 

The inverse of « DRY » (Don’t Repeat Yourself) is « Decoupled » !

That's particularly true for data replication, which is often considered bad; but replication permits more autonomy and avoids the synchronous calls that directly impact availability and reliability. The solution is asynchronous: use events to notify interested components of a change.
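Here is a minimal Java sketch of that idea (the event, bus and component names are mine, not from the talks): instead of synchronously querying the component that owns the data, interested components keep their own replicated copy and update it when they receive a change event.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// The event carries the facts that other components may want to replicate.
record CustomerMoved(String customerId, String newAddress) {}

// A trivial in-process event bus; in a real system this would be a message broker.
class EventBus {
    private final List<Consumer<CustomerMoved>> subscribers = new ArrayList<>();
    void subscribe(Consumer<CustomerMoved> s) { subscribers.add(s); }
    void publish(CustomerMoved e)             { subscribers.forEach(s -> s.accept(e)); }
}

class ReplicationDemo {
    public static void main(String[] args) {
        EventBus bus = new EventBus();

        // The shipping component keeps its own copy of the address:
        // no synchronous call to the customer component is needed to prepare a mailing.
        Map<String, String> shippingAddresses = new HashMap<>();
        bus.subscribe(e -> shippingAddresses.put(e.customerId(), e.newAddress()));

        // The component that owns the data publishes the change instead of being queried.
        bus.publish(new CustomerMoved("42", "221B Baker Street, London"));

        System.out.println(shippingAddresses.get("42")); // 221B Baker Street, London
    }
}
```

The remaining coupling is on the event contract itself, not on the availability of the other component.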

The second keyword I noticed several times is « entanglement » (actually there were several keywords, but I group them together: cyclic dependencies, braiding, intricacy). Reducing dependencies is a point I have been promoting for a long time. A dependency is often added as a micro-level decision, but every time one is added it augments the overall complexity of the system. I often use Structure 101 to manage the dependencies in the Java systems I work with and I'm very satisfied with it.

To conclude this first-day report, I quote Rich Hickey, the creator of Clojure:

programmers know the benefits of everything but the trade-offs of nothing

Identity, Data and State: The Fundamentals of Object and Functional Design and How to Manage Them

State management is a subject which is currently at the forefront of debate in the software world. The functional paradigm, as well as reflection on the basics of the object paradigm, means that uncontrolled changes in state are seen as the root of all troubles. In this article I will return to the notions of state and identity because I consider them crucial. What is an object's state? What is an identity? These concepts apply whatever paradigm is used, whether object-oriented or functional.

Identity State Behavior

When I started my professional career as an intern, I had a discussion with my project manager over a UML model (she had a PhD in computing and specialized in object-oriented topics) and her first question was "But where is the dynamic part? Why isn't there a state-transition diagram?". It was a complete revelation for me and initiated a deep change in my approach to software system modeling: the behavior, what the system does, is just as important as the structure, what the system is. This balance between the static part and the dynamic part must always be respected in order to obtain a design worthy of the name, and a guarantee of good maintainability. Some people even speak of the tao of programming, a balance between the Yin (the state) and the Yang (the behavior).

Data

Before tackling the notion of state, I'd like to come back to the notion of data which, composed through the object's attributes, constitutes its state.

Data is a pair formed by a concept and a measure.

The concept the data refers to must have a precise definition, making it possible to agree on its semantics.

The types of measure can be extremely varied, but data can often be characterized along the following dimensions:

A piece of data can be fixed (immutable) or not (mutable). For mutable data, the measure changes over time, so the value of the object's attribute can change. The frequency of change can be high or low, and the change can be very visible or not. These different characteristics allow each piece of data to be described, and can be summarized in the following diagram:

Example

Different kinds of data characteristics
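To make these characteristics concrete, here is a small Java sketch (the class and attribute names are mine, purely illustrative): one immutable datum and two mutable ones with very different frequencies of change.

```java
import java.time.LocalDate;

// Each attribute is a datum: a concept (the attribute name) plus a measure (its value).
class Patient {
    private final LocalDate birthDate;  // immutable: the measure never changes once known
    private String address;             // mutable, but with a low frequency of change
    private int heartRate;              // mutable, with a very high frequency of change

    Patient(LocalDate birthDate, String address) {
        this.birthDate = birthDate;
        this.address = address;
    }

    void moveTo(String newAddress) { this.address = newAddress; }
    void recordHeartRate(int bpm)  { this.heartRate = bpm; }

    LocalDate birthDate() { return birthDate; }
    String address()      { return address; }
    int heartRate()       { return heartRate; }
}
```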

This reminder of the notion of data is enough to start tackling the notion of state.

State

A state is the sum of data values which characterize an object at a precise moment in time.

In the widely accepted object-oriented paradigm, the state of an object defines its structure. This state is important for the object because it will influence its behavior when it reacts to a future stimulus.

The « state » characteristics are as follows:

  • A state is the sum of the data values characterizing an object at a precise moment in time. An object is characterized by a group of attributes which are composed of data (concept and measure).
  • A state is sometimes summarized by a past participle with a static and stable meaning.
  • A state lasts as long as no external stimulus interacts with the object holding the state.
  • The state influences the behavior of the object.
  • A state, by definition, implies transformations of the object between different states, which occur during state transitions.
  • A state is associated with a particular object which has an identity, in order to follow the different state transitions.
  • A state is always signified by an attribute whose data is discrete and variable. By discrete I mean that even a continuous attribute ends up being made discrete: for example, the "majority" state of a person is discrete (adult, minor) but is obtained through a continuous attribute, the age of the person.

Example

The following diagram illustrates the states that a person can go through:

The different states (not exhaustive) that a person can be in at a given time

Every transition is triggered by an event which leads to the transformation of the object in question. Continuous data like age can be "tamed" by a threshold condition, like the passage to adulthood, and can thus be turned into a state.
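A minimal Java sketch of that last point (the names and the threshold value are mine, for illustration only): the continuous attribute, the birth date, is turned into a discrete state by a threshold condition.

```java
import java.time.LocalDate;
import java.time.Period;

class Person {
    // The discrete state derived from a continuous attribute.
    enum Majority { MINOR, ADULT }

    private final LocalDate birthDate;

    Person(LocalDate birthDate) { this.birthDate = birthDate; }

    // The threshold condition "tames" the continuous age into a discrete state.
    Majority majority(LocalDate today) {
        int age = Period.between(birthDate, today).getYears();
        return age >= 18 ? Majority.ADULT : Majority.MINOR;
    }
}

class MajorityDemo {
    public static void main(String[] args) {
        Person p = new Person(LocalDate.of(2000, 3, 14));
        System.out.println(p.majority(LocalDate.of(2012, 3, 8))); // MINOR
        System.out.println(p.majority(LocalDate.of(2020, 3, 8))); // ADULT
    }
}
```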

Identity

Identity is what makes it possible to distinguish between oneself and the other. It’s a basic biological principle. For that matter, Alan Kay established his entire vision of the object paradigm on biology:

Alan Curtis Kay

I thought of objects being like biological cells and/or individual computers on a network, only able to communicate with messages (so messaging came at the very beginning – it took a while to see how to do messaging in a programming language efficiently enough to be useful)

The concept of identity for a person seems very natural to us. This identity starts at birth and ends with death, but can even go beyond that. What's interesting is the transformations this person continuously undergoes: his age, his financial situation, even his name can change while that same identity is maintained. The object's attributes change but the identity stays the same.

From the programming point of view, different objects can co-exist, with different attributes, within the same identity. It's typically the case for a client, who can co-exist in very different contexts: a marketing context to get to know his consumer habits, an accounting context to handle his payments, a logistical context to handle mailing, etc. Nevertheless, his identity will need to be precisely defined in order to be able to link the different objects together, each representing a different aspect of the same identity. Does it matter to the software user that this identity evolving in time is always the same one? Or is it acceptable to consider that they are different? That is the main question the modeler must answer, because any software runs on a model, which is only a representation of reality.

This leads us to the concept of entity which is defined as the association of an identity and a state.

An entity is the association of an identity and a state.

The object paradigm fuses these two concepts in the object; the functional paradigm keeps them separate: this is where their fundamental difference lies. The notions of identity and state, which constitute the entity, are nevertheless eminently temporal, as shown in the following diagram:

Identity and state evolution over time for a particular object

The identity is the invariant of the object over time, and it unites the sum of its attributes, that is, its state. In the functional paradigm, this notion of the identity referencing its attributes is the foundation of persistent data structures. The functional paradigm brings this temporal dimension which, I think, is lacking in the object-oriented paradigm, and it greatly eases concurrency management.
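Here is a minimal Java sketch of that separation (the names are mine; it only loosely mimics what Clojure references and persistent data structures provide): a stable identity that points, over time, to successive immutable states.

```java
// The state is an immutable value: a snapshot of the attributes at one moment in time.
record ClientState(String name, String address) {}

// The entity: a stable identity that references, over time, a succession of states.
class ClientEntity {
    private final String id;               // the identity: invariant over time
    private volatile ClientState current;  // the state: replaced, never mutated

    ClientEntity(String id, ClientState initial) {
        this.id = id;
        this.current = initial;
    }

    String id()           { return id; }
    ClientState current() { return current; }

    // A transition installs a new snapshot; old snapshots remain valid
    // for whoever observed them, which is what makes concurrent reads easy.
    void moveTo(String newAddress) {
        current = new ClientState(current.name(), newAddress);
    }
}
```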

Put into practice

Manage states and their transitions

Every object with a strong identity must provide the operations allowing the manipulation of its state in a controlled manner:

  • All state transitions must manifest themselves through an operation on the object: for example, the operation "to divorce" on a Person object (see the sketch after this list).
  • Operations must make it possible to interrogate the state of the object (the famous “getters”).
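A minimal Java sketch of these two rules (the class and its transitions are my own simplified example): the transitions are intention-revealing operations that guard the allowed changes, and a getter exposes the current state.

```java
class Person {
    enum MaritalStatus { SINGLE, MARRIED, DIVORCED }

    private MaritalStatus status = MaritalStatus.SINGLE;

    // State transitions are exposed as operations, not as a raw setter on the field.
    void marry() {
        if (status == MaritalStatus.MARRIED) {
            throw new IllegalStateException("Already married");
        }
        status = MaritalStatus.MARRIED;
    }

    void divorce() {
        if (status != MaritalStatus.MARRIED) {
            throw new IllegalStateException("Cannot divorce while " + status);
        }
        status = MaritalStatus.DIVORCED;
    }

    // The "getter" that lets clients interrogate the state.
    MaritalStatus maritalStatus() { return status; }
}
```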

Manage the identity

Every entity must provide an identifier which is unique within the whole system, even if its attributes are identical to those of another entity.

Above all, identity requires real reflection, during design, on the notions of difference and equality between objects. This notion of identity is part of the model, so every identity must define precisely what makes an object unique and what makes two objects equal, even if only in free text. Always be vigilant, during exchanges with domain experts, about equality expressed in terms of object attributes, which can hide a lack of reflection on identities.

In certain cases, the identity will be composed of a combination of several attributes. For example, an episode of a TV series will be identified by its title, its season and its episode number (e.g. "Numb3rs-Season6-Episode13").
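In Java, such a composite identity can be captured by a small value type; a record is a natural fit since it derives equals and hashCode from the attributes (the names below are mine, for illustration).

```java
// The identity is the combination of the three attributes: two ids with the
// same series title, season and episode number designate the same episode.
record EpisodeId(String seriesTitle, int season, int episode) {
    @Override
    public String toString() {
        return seriesTitle + "-Season" + season + "-Episode" + episode;
    }
}

class EpisodeIdDemo {
    public static void main(String[] args) {
        EpisodeId a = new EpisodeId("Numb3rs", 6, 13);
        EpisodeId b = new EpisodeId("Numb3rs", 6, 13);
        System.out.println(a.equals(b)); // true: same composite identity
        System.out.println(a);           // Numb3rs-Season6-Episode13
    }
}
```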

In other cases, a unique and unchangeable identifier will be generated and attributed to the object. This is the usual client number on your electricity bill. The generation of these identifiers is an entirely different subject given the technicalities it implies (concurrency, distribution). Finally, generating identifiers requires thinking about the boundaries of the system in order to avoid collisions, particularly during exchanges with another system, as when a person is identified by his social security number.

The choice of identifier often reveals what lies at the heart of the information system: using telephone numbers, rather than the person, as identifiers makes any common action difficult for a client who possesses several phone numbers.

Conclusion

In conclusion, the fundamental questions in an object design are:

  • What makes two objects considered identical or different?
  • What are the attributes which compose the object? What are the characteristics of these attributes? Most notably, which ones influence the behavior and how do they evolve over time?

These questions, related to the identity and the state of the object, are crucial for the realization of the domain model, which is the basis of any enterprise architecture project.