State management is currently at the forefront of debate in the software world. Both the functional paradigm and a renewed reflection on the foundations of the object paradigm have led uncontrolled state changes to be seen as the root of all troubles. In this article I return to the notions of state and identity, because I consider them crucial. What is the state of an object? What is an identity? These concepts apply whatever the paradigm, whether object-oriented or functional.
When I started my professional career as an intern, I had a discussion about a UML model with my project manager (she had a PhD in computer science and specialized in object orientation). Her first question was: “But where is the dynamic part? Why isn’t there a state-transition diagram?” It was a revelation for me and triggered a deep change in my approach to modeling software systems: the behavior, what the system does, is just as important as the structure, what the system is. This balance between the static part and the dynamic part must always be respected to obtain a design worthy of the name, which is a sign of good maintainability. Some people even speak of the tao of programming, a balance between the Yin (the state) and the Yang (the behavior).
Before tackling the notion of state, I’d like to return to the notion of data: composed through the attributes of an object, data constitutes its state.
A piece of data is a pair formed by a concept and a measure.
The concept to which the data refers must have a precise definition, so that everyone can agree on its semantics.
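As a minimal sketch of this concept/measure pairing, a piece of data could be modeled as a small immutable value. The names `Datum`, `concept` and `measure`, as well as the example values, are assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical set of measure types; a real model would likely be richer.
Measure = Union[int, float, str, bool]


@dataclass(frozen=True)
class Datum:
    """A piece of data: a concept paired with a measure."""

    concept: str      # what is being measured, with an agreed definition
    measure: Measure  # the value observed for that concept


# Illustrative value: the concept "age in years" measured as 42.
age = Datum(concept="age in years", measure=42)
```

Making the pair immutable means that a new measure is represented by a new `Datum` rather than by overwriting the old one.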
The type of measure can be extremely varied, but can often be categorized as follows:
A piece of data can be fixed (immutable) or not (mutable). When it is mutable, its measure changes over time, and so does the value of the corresponding object attribute. This change can be frequent or rare, and it can be highly visible or not. These different characteristics make it possible to describe each piece of data and can be summarized in the following diagram:
This reminder of the notion of data is enough to start tackling the notion of state.
A state is the sum of data values which characterize an object at a precise moment in time.
In the commonly accepted object-oriented paradigm, the state of an object defines its structure. This state matters to the object because it influences how the object behaves the next time it responds to a stimulus.
The characteristics of a state are as follows:
- A state is the sum of the data values characterizing an object at a precise moment in time. An object is characterized by a group of attributes which are composed of data (concept and measure).
- A state is sometimes summarized by a past participle with a static and stable meaning (for example, “divorced”).
- A state lasts as long as no external stimulus interacts with the object holding the state.
- The state influences the behavior of the object.
- By definition, a state implies that the object is transformed from one state to another, and these transformations take place during state transitions.
- A state is associated with a particular object which has an identity, so that its successive state transitions can be followed.
- A state is always expressed by an attribute whose data is discrete and variable. A discrete state may nevertheless be derived from a continuous attribute. For example, the “majority” state of a person is discrete (adult, minor) but is obtained from a continuous attribute, the age of the person.
The following diagram illustrates the states that a person can go through:
Every transition is triggered by an event which leads to the transformation of the object concerned. Continuous data such as age can be made discrete by a threshold condition, such as reaching the age of majority, and can thereby be turned into a state.
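The threshold idea can be sketched in a few lines. This is only an illustration: the names `Majority`, `majority_of` and `transition` are hypothetical, and the threshold of 18 is an assumption, since the age of majority depends on the jurisdiction.

```python
from enum import Enum
from typing import Optional, Tuple


class Majority(Enum):
    MINOR = "minor"
    ADULT = "adult"


# Assumption for the example: the age of majority varies by jurisdiction.
AGE_OF_MAJORITY = 18


def majority_of(age_in_years: float) -> Majority:
    """Derive the discrete state from the continuous attribute."""
    if age_in_years >= AGE_OF_MAJORITY:
        return Majority.ADULT
    return Majority.MINOR


def transition(age_before: float, age_after: float) -> Optional[Tuple[Majority, Majority]]:
    """Return (old state, new state) if the change of age crosses the threshold."""
    before, after = majority_of(age_before), majority_of(age_after)
    return (before, after) if before is not after else None
```

Most changes of age leave the state untouched; only the change that crosses the threshold produces a state transition.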
Identity is what makes it possible to distinguish between oneself and others. It’s a basic biological principle. In fact, Alan Kay based his entire vision of the object paradigm on biology:
I thought of objects being like biological cells and/or individual computers on a network, only able to communicate with messages (so messaging came at the very beginning – it took a while to see how to do messaging in a programming language efficiently enough to be useful)
The concept of identity for a person seems very natural to us. This identity starts at birth and ends at death, and can even extend beyond it. What is interesting is the transformations that this person continuously undergoes: age, financial situation and even name can change while the identity remains the same. The object attributes change but the identity stays the same.
From a programming point of view, different objects with different attributes can co-exist for the same identity. This is typically the case for a client, who can exist in very different contexts: a marketing context to understand consumer habits, an accounting context to handle payments, a logistics context to handle mailing, and so on. Nevertheless, the identity must be precisely defined so that the different objects, each representing a different aspect of the same identity, can be linked together. Does it matter to the software user that this identity, as it evolves over time, is always considered the same one? Or is it acceptable to consider the versions as different? This is the main question the designer must answer, because any software runs on a model, which is only a representation of reality.
This leads us to the concept of entity which is defined as the association of an identity and a state.
The object paradigm fuses these two concepts in the object, whereas the functional paradigm separates them; this is where their fundamental difference lies. The notions of identity and state, which together constitute the entity, are nevertheless eminently temporal, as shown in the following diagram:
The identity is the invariant of the object over time, and it ties together the sum of its attributes, that is, its state. In the functional paradigm, this notion of references from the identity to the attributes is the foundation of persistent data structures. The functional paradigm brings the temporal dimension which, I think, is lacking in the object-oriented paradigm, and it greatly eases concurrency management.
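To make the separation concrete, here is a small sketch in which the identity is a stable reference to a succession of immutable state values. It is not a true persistent data structure (there is no structural sharing), and the names `PersonState`, `PersonIdentity`, `advance` and `state_at` are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class PersonState:
    """An immutable snapshot of the attributes at one moment in time."""

    name: str
    age: int


class PersonIdentity:
    """A stable identity pointing to a succession of immutable states."""

    def __init__(self, person_id: str, initial: PersonState) -> None:
        self.person_id = person_id
        self._history: List[PersonState] = [initial]

    @property
    def current(self) -> PersonState:
        return self._history[-1]

    def advance(self, **changes) -> PersonState:
        # A new state value is derived; earlier snapshots are never modified.
        new_state = replace(self.current, **changes)
        self._history.append(new_state)
        return new_state

    def state_at(self, version: int) -> PersonState:
        return self._history[version]


alice = PersonIdentity("p-1", PersonState(name="Alice Martin", age=30))
snapshot = alice.current   # a reader keeps this value...
alice.advance(age=31)      # ...while the identity moves on to a new state
```

Because a snapshot can never change underneath the code that holds it, it can be read from several threads without locking, which is the kind of ease in concurrency management mentioned above.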
Put into practice
Manage states and their transitions
Every object with a strong identity must provide the operations allowing the manipulation of its state in a controlled manner:
- Every state transition must be exposed as an operation on the object: for example, the “divorce” operation on a person object.
- Operations must make it possible to query the state of the object (the famous “getters”).
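The two rules above can be sketched as follows. The set of marital statuses and the allowed transitions are assumptions made for the example, as are the names `Person`, `MaritalStatus` and `InvalidTransition`.

```python
from enum import Enum


class MaritalStatus(Enum):
    SINGLE = "single"
    MARRIED = "married"
    DIVORCED = "divorced"


class InvalidTransition(Exception):
    """Raised when an operation is not allowed in the current state."""


class Person:
    def __init__(self, person_id: str) -> None:
        self._person_id = person_id
        self._marital_status = MaritalStatus.SINGLE

    @property
    def marital_status(self) -> MaritalStatus:
        # Query operation (a "getter"): the state is readable but has no setter.
        return self._marital_status

    def marry(self) -> None:
        if self._marital_status is MaritalStatus.MARRIED:
            raise InvalidTransition("already married")
        self._marital_status = MaritalStatus.MARRIED

    def divorce(self) -> None:
        # The transition is only reachable through this named operation.
        if self._marital_status is not MaritalStatus.MARRIED:
            raise InvalidTransition("only a married person can divorce")
        self._marital_status = MaritalStatus.DIVORCED
```

Since there is no public setter, the state can only change through operations that check the current state first, so every transition stays under the control of the object.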
Manage the identity
Every entity must provide an identifier which is unique within the whole system, even if its attributes are identical to those of another entity.
Above all, identity calls for real thought, during design, about the notions of difference and equality between objects. This notion of identity is part of the model, so every identity must precisely define its unifier and its equalizer, even if only in free text. During discussions with domain experts, always be wary of equalities based on object attributes, which can hide a lack of thought about identities.
In certain cases, the identity will be composed of a combination of several attributes. For example, an episode of a TV series will be identified by its title, its season and its episode number (e.g. “Numb3rs-Season6-Episode13”).
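A composite identity of this kind might be sketched as below. The class `Episode` and the extra `synopsis` attribute are hypothetical; the point is only that equality and hashing depend on the identifying attributes and nothing else.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Episode:
    # The three attributes that together form the identity.
    series_title: str
    season: int
    number: int
    # Descriptive attribute (hypothetical), excluded from equality and hashing.
    synopsis: str = field(default="", compare=False)

    @property
    def identifier(self) -> str:
        return f"{self.series_title}-Season{self.season}-Episode{self.number}"


draft = Episode("Numb3rs", 6, 13, synopsis="first draft")
final = Episode("Numb3rs", 6, 13, synopsis="reviewed text")
```

Here `draft` and `final` compare as equal because they designate the same episode, even though a non-identifying attribute differs.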
In other cases, a unique and unchangeable identifier will be generated and assigned to the object. This is the familiar client number on your electricity bill. Generating these identifiers is a subject in its own right because of the technical issues it involves (concurrency, distribution). Finally, generating identifiers requires thinking about the boundaries of the system in order to avoid collisions, including during exchanges with another system, as when a person is identified by a social security number.
The choice of identifier often reveals what the information system is centered on: when telephone numbers rather than persons serve as identifiers, any action that spans a client who possesses several phone numbers becomes difficult.
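As an illustration of a generated identifier, the sketch below uses a random UUID from Python's standard `uuid` module, which is one possible choice among many: it needs no coordination between processes, at the price of an identifier that is not human-friendly. The class `Client` and its attributes are assumptions for the example.

```python
import uuid
from typing import Optional


class Client:
    def __init__(self, name: str, client_id: Optional[uuid.UUID] = None) -> None:
        self.name = name  # ordinary attribute, free to change over time
        # Generated once; an existing identifier can be passed when reloading.
        self._client_id = client_id if client_id is not None else uuid.uuid4()

    @property
    def client_id(self) -> uuid.UUID:
        # Read-only: the identifier cannot be reassigned after creation.
        return self._client_id

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Client):
            return NotImplemented
        return self._client_id == other._client_id

    def __hash__(self) -> int:
        return hash(self._client_id)


first = Client("Alice Martin")
homonym = Client("Alice Martin")  # same attributes, different identity
renamed = Client("Alice Durand", client_id=first.client_id)  # same identity
```

Two clients with identical attributes remain distinct entities, while a client whose name has changed is still recognized as the same one.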
In conclusion, the fundamental questions in an object design are:
- What makes two objects identical or different?
- What are the attributes which compose the object? What are the characteristics of these attributes? In particular, which ones influence the behavior, and how do they evolve over time?
These questions about the identity and the state of the object are crucial for building the domain model, which is the foundation of any enterprise architecture project.