TITLE: OOD principles - abstraction and stability

AUTHOR: rmartin@rcmcon.com (Robert Martin), 1 Sep 95

The discussion over principle #8 was very interesting.  We discussed
the relationship between intrinsic stability and positional stability,
and found that there are grounds for an equivalence between them.

This next principle suggests that stability ought also to be a
function of the composition of a category.  It is not enough simply to
call a category stable; it should also be composed of entities that
have high intrinsic stability, namely abstract classes.

--------------------------------------------------------------------
   9. The more stable a class category is, the more it must
      consist of abstract classes.  A completely stable category
      should consist of nothing but abstract classes.
--------------------------------------------------------------------

The stability being referred to in this principle is again the I
metric, which is based upon "positional" stability.  That is, a module
that is in a highly stable position in the dependency graph should
also be highly abstract.  

The justification for this principle is based upon the idea that
executable code (the implementations of methods) changes more often
than the interfaces between modules.  Therefore interfaces have more
intrinsic stability than executable code.  In C++, it is more likely
that you will change a .cc file than a .h file.

In some languages, the division between implementation and interface
is very clear cut.  But in others (like C++) it is blurred.  The
stability of the .h file is not as great as it could be because
private functions and private variables are likely to need changing as
the implementation evolves.  In such languages, the maximum stability
comes from pure interfaces.  A class that contains pure interfaces is
an abstract class.
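
To make the distinction concrete, here is a minimal C++ sketch of a
pure interface and one derivative.  The names Shape, Circle and area
are hypothetical, chosen only for illustration.  The declaration of
Shape holds no private functions or variables, so nothing in it needs
to change as an implementation evolves; those details live in Circle.

```cpp
#include <type_traits>

// Hypothetical pure interface: no data members, no private functions.
// Nothing here has to change when an implementation evolves.
class Shape {
public:
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

// Hypothetical concrete derivative.  Its private variables can change
// freely without touching the declaration of Shape.
class Circle : public Shape {
public:
    explicit Circle(double radius) : radius_(radius) {}
    double area() const override {
        return 3.141592653589793 * radius_ * radius_;
    }
private:
    double radius_;
};

// The compiler confirms which class is abstract.
static_assert(std::is_abstract<Shape>::value, "Shape is abstract");
static_assert(!std::is_abstract<Circle>::value, "Circle is concrete");
```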

Thus, we can measure the abstraction of a category by computing the
ratio of abstract classes to total classes:

      A = (# abstract classes) / (# of classes).

For example, if a class category contains 10 classes, of which 6 are
abstract, then A = .6.  A always lies in the range [0,1].
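
The ratio can be sketched in C++ as follows.  The function name
abstractness and its argument checks are assumptions made for this
example; only the formula comes from the text above.

```cpp
#include <cstddef>
#include <stdexcept>

// A = (# abstract classes) / (# of classes).
// Rejects an empty category and inconsistent counts (hypothetical
// error handling; the text does not say what to do in those cases).
constexpr double abstractness(std::size_t abstract_classes,
                              std::size_t total_classes) {
    return (total_classes == 0 || abstract_classes > total_classes)
        ? throw std::invalid_argument("invalid class counts")
        : static_cast<double>(abstract_classes) /
              static_cast<double>(total_classes);
}
```

For the category described above, abstractness(6, 10) yields .6.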

Principle 9 says that categories that are placed into positions of
stability ought to be abstract.  The reason for this is that the higher
the abstraction of the category, the higher its intrinsic stability.
And since intrinsic stability is limited by positional stability,
intrinsically stable modules should also have positional stability.

Consider a category with very low abstraction.  Such a category has
mostly concrete classes in it.  If it is put in a place of stability,
then many other categories will depend upon it.  And thus, it will be
difficult to change its implementation.  On the other hand, consider a
category with very high abstraction.  Such a category consists mostly
of abstract classes.  If we put this category into a position of very
low stability, then very few other categories will depend upon it.
And then, of what use is the abstract interface?

But there is another reason to put abstractions in positions of
stability.  That reason goes back to principle #1, the open closed
principle.  We want concrete categories to depend upon abstract
categories so that the derivatives of those abstract categories can be
controlled by the concrete categories.  This is reuse.  The concrete
categories can be reused with different derivatives of the abstract
categories.  Also, when the implementations of those derivative
categories change, the abstract category is not affected, and so the
concrete categories that depend upon it are also unaffected.  Thus,
the concrete categories are open to be extended, but do not need to be
modified in order to achieve that extension.
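
A small C++ sketch may help here.  All of the names (Sensor,
average_reading, FixedSensor, ScaledSensor) are hypothetical and stand
in for whole categories.  The concrete client, average_reading,
depends only on the abstraction Sensor, so it can be reused with any
derivative and need not be modified when derivatives are added or
changed.

```cpp
#include <cassert>
#include <vector>

// Abstract category (hypothetical): a pure interface.
class Sensor {
public:
    virtual ~Sensor() = default;
    virtual double read() const = 0;
};

// Concrete category (hypothetical): depends only on the abstraction.
// It is reused unchanged with any derivative of Sensor.
inline double average_reading(const std::vector<const Sensor*>& sensors) {
    assert(!sensors.empty());
    double sum = 0.0;
    for (const Sensor* s : sensors) {
        assert(s != nullptr);
        sum += s->read();
    }
    return sum / static_cast<double>(sensors.size());
}

// Derivatives (hypothetical): new ones can be added, and existing
// ones changed, without modifying Sensor or average_reading.
class FixedSensor : public Sensor {
public:
    explicit FixedSensor(double value) : value_(value) {}
    double read() const override { return value_; }
private:
    double value_;
};

class ScaledSensor : public Sensor {
public:
    ScaledSensor(const Sensor& inner, double factor)
        : inner_(inner), factor_(factor) {}
    double read() const override { return inner_.read() * factor_; }
private:
    const Sensor& inner_;
    double factor_;
};
```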

The open closed principle leads us to the realization that we do not
want all of our categories to be stable.  Stability is inflexibility.
A stable category is difficult to change.  And we do not want our
applications to be difficult to change.  But the open closed principle
allows that a module that is highly stable (closed for modification)
can also be open for extension.  Yet this can only happen if the
extension takes place in a derivative of the stable abstraction.
Thus, in the ideal world our models would consist of two kinds of
categories: completely abstract and stable categories, which are
depended upon by completely concrete and instable categories.

We are now in a position to define the relationship between stability
(I) and abstractness (A).  We can create a graph with A on the
vertical axis and I on the horizontal axis.  If we plot the two "good"
kinds of categories on this graph, we will find the categories that
are maximally stable and abstract at the upper left, at (I,A) = (0,1).
The categories that are maximally instable and concrete are at the
lower right, at (1,0).

But not all categories can fall into one of these two positions.
Categories have degrees of abstraction and stability.  For example, it
is very common that one abstract class derives from another abstract
class.  The derivative is an abstraction that has a dependency.  Thus,
though it is maximally abstract, it will not be maximally stable.  Its
dependency will decrease its stability.

Consider a category with A=0 and I=0.  This is a highly stable and
concrete category.  Such a category is not desirable because it is
rigid.  It cannot be extended because it is not abstract.  And it is
very difficult to change because of its stability.

Consider a category with A=1 and I=1.  This category is also
undesirable (perhaps impossible) because it is maximally abstract and
yet has no dependents.  It, too, is rigid because the abstractions are
impossible to extend.

But what about a category with A=.5 and I=.5?  This category is
partially extensible because it is partially abstract.  Moreover, it
is partially stable so that the extensions are not subject to maximal
instability.  Such a category seems "balanced".  Its stability is in
balance with its abstractness.

Consider again the A-I graph (below).  We can draw a line from (0,1)
to (1,0).  This line represents categories whose abstractness is
"balanced" with stability.  Because of its similarity to a graph used
in astronomy, I call this line the "Main Sequence".

               |
               |
              1+ (0,1)
               |\
               | \
   Abstractness|  \ The
               |   \ Main
               |    \ Sequence
               |     \
               |      \
               |       \  (1,0)
               +--------+--
                        1
                  Instability


A category that sits on the main sequence is not "too abstract" for
its stability, nor is it "too instable" for its abstractness.  It has the
"right" number of concrete and abstract classes in proportion to its
positional stability.  Clearly, the most desirable positions for a
category to hold are at one of the two endpoints of the main sequence.
However, in my experience only about half the categories in a project
can have such ideal characteristics.  Those other categories have the
best characteristics if they are on or close to the main sequence.

This leads us to another metric.  If it is desirable for categories
to be on or close to the main sequence, we can create a metric which
measures how far away a category is from this ideal.

D : Distance : |A+I-1| / sqrt(2) : The perpendicular distance of a
    category from the main sequence.  This metric ranges over
    [0,~0.707].  (One can normalize this metric to the range [0,1]
    by using the simpler form |A+I-1|.  I call this metric D'.)
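
Both forms of the metric can be sketched in C++ as below.  The
function names and the range check are assumptions of this example;
the formulas are the ones given above.

```cpp
#include <stdexcept>

// D' = |A + I - 1|, the normalized distance, range [0,1].
// Rejects inputs outside [0,1] (hypothetical error handling).
constexpr double normalized_distance(double a, double i) {
    return (a < 0.0 || a > 1.0 || i < 0.0 || i > 1.0)
        ? throw std::invalid_argument("A and I must each lie in [0,1]")
        : (a + i - 1.0 < 0.0 ? -(a + i - 1.0) : a + i - 1.0);
}

// D = |A + I - 1| / sqrt(2), the perpendicular distance from the
// main sequence, range [0,~0.707].  The constant is sqrt(2).
constexpr double main_sequence_distance(double a, double i) {
    return normalized_distance(a, i) / 1.4142135623730951;
}
```

A category at either endpoint of the main sequence, or at A=.5 and
I=.5, has a distance of zero; the corners A=0, I=0 and A=1, I=1 give
the maximum.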

Given this metric, a design can be analyzed for its overall
conformance to the main sequence.  The D metric for each category can
be calculated.  Any category that has a D value that is not near zero
can be reexamined and restructured.

Statistical analysis of a design is also possible.  One can calculate
the mean and variance of all the D metrics within a design.  One would
expect a conformant design to have a mean and variance which were
close to zero.  The variance can be used to establish "control limits"
which can identify categories that are "exceptional" in comparison to
all the others.
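
One way such an analysis might look in C++ is sketched below.  The
names Stats, summarize and exceptional are hypothetical.  Two choices
here are assumptions, not something the text prescribes: the variance
is the population variance (dividing by n), and the control limit is
the mean plus k standard deviations, with k left to the caller.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Stats {
    double mean;
    double variance;  // population variance (divides by n); an assumption
};

// Mean and variance of the D metrics of all categories in a design.
inline Stats summarize(const std::vector<double>& d_values) {
    assert(!d_values.empty());
    const double n = static_cast<double>(d_values.size());
    double sum = 0.0;
    for (double d : d_values) sum += d;
    const double mean = sum / n;
    double squared = 0.0;
    for (double d : d_values) squared += (d - mean) * (d - mean);
    return Stats{mean, squared / n};
}

// Indices of categories whose D lies above mean + k * sqrt(variance).
// The multiplier k is a hypothetical choice; the text does not fix one.
inline std::vector<std::size_t> exceptional(
        const std::vector<double>& d_values, double k) {
    const Stats s = summarize(d_values);
    const double upper_limit = s.mean + k * std::sqrt(s.variance);
    std::vector<std::size_t> result;
    for (std::size_t idx = 0; idx < d_values.size(); ++idx) {
        if (d_values[idx] > upper_limit) result.push_back(idx);
    }
    return result;
}
```

The indices returned by exceptional identify the categories to
reexamine first.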