TITLE: Musings on software metrics


RESPONSE: rmartin@rcmcon.com (Robert Martin), 21 Jun 94

I have been working on a set of design quality metrics for some time
now.  I have had favorable results with metrics that measure the
ratio of source code dependencies to abstraction.

The metrics depend on the premise that classes should be clustered
together into tightly bound groups (Booch calls them categories).
These class categories are the unit of reusability, and the unit of
release.  Source code dependencies are tracked between categories.
The metrics are as follows:

Ca ("a" stands for "Afferent") is the number of classes from other
categories that depend upon this category.

Ce ("e" stands for "Efferent") is the number of classes in other
categories that the classes in this category depend upon.

A is the number of abstract classes in this category divided by the
total number of classes in this category.  Thus A is a number from 0
to 1.

I is a measure of the instability of the category.  Its formula is
I = Ce/(Ce+Ca).  It is called instability because of what drives it
toward zero:  1. A high Ca means that many other classes depend upon
this category, making it hard to change, i.e. stable.  2. A low Ce
means that there are few classes that this category depends upon, and
thus few sources of external change; stable again.  I is a number
from 0 (highly stable) to 1 (highly unstable).
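
A minimal sketch of how Ca, Ce, A and I might be computed from the
definitions above.  The data model (a dict mapping each class name to
its category, an is-abstract flag, and the set of classes it depends
on), the sample classes, and the function name are all hypothetical.
The text does not say what I should be when Ce+Ca is zero; returning 0
in that case is an assumption.

```python
# Hypothetical model: class name -> (category, is_abstract, classes it depends on)
classes = {
    "Shape":  ("geometry", True,  set()),
    "Circle": ("geometry", False, {"Shape"}),
    "Canvas": ("ui",       False, {"Shape", "Circle"}),
    "Button": ("ui",       False, {"Canvas"}),
}

def category_metrics(classes, category):
    """Return (Ca, Ce, A, I) for one category."""
    members = {name for name, (cat, _, _) in classes.items() if cat == category}

    # Ca: classes in other categories that depend on a class in this one.
    ca = sum(1 for cat, _, deps in classes.values()
             if cat != category and deps & members)

    # Ce: distinct classes in other categories that this category's
    # classes depend on.
    external = set()
    for name in members:
        external |= classes[name][2] - members
    ce = len(external)

    # A: fraction of this category's classes that are abstract.
    a = sum(1 for name in members if classes[name][1]) / len(members)

    # I: Ce/(Ce+Ca); 0.0 for an isolated category is an assumption.
    i = ce / (ce + ca) if (ce + ca) else 0.0
    return ca, ce, a, i
```

With this sample data, "geometry" has Ca=1, Ce=0, A=0.5, I=0, and "ui"
has Ca=0, Ce=2, A=0, I=1.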

If we plot A vs I on a graph we find two nodes.  A=1 and I=0
represents a category which is highly abstract and completely
independent.  This is good.  Abstract categories should not depend
upon anybody else since if they are to be reused, they would have to
drag along everything else that they depend upon.  The other node is
A=0 and I=1.  This node represents categories which are completely
detailed and depend heavily upon other categories.  This is good too,
since the details of an application should be highly dependent.  In
the perfect application, the detailed categories would depend upon the
abstract categories, and the abstract categories would be independent.

But the world is not perfect and there are degrees of abstraction and
Instability.  Thus we can connect these two nodes with a line:

                   |
         A=1, I=0  *
                   | \
                   |   \    The main sequence
                   |     \
                A  |       \
                   +---------*------
                      I      A=0, I=1

I call this line "the main sequence" after a similar structure in
astronomy.  The main sequence represents the ratio between Abstraction
and Instability that a well-structured class category should sit upon.
If the A and I metrics of a category place it upon the main sequence,
then its level of abstraction is appropriate to its dependencies.

This leads us to the final metric: D, which is the distance of the
category from the main sequence.  D = abs(A+I-1)/sqrt(2).  Thus D
ranges from 0 (on the main sequence) to sqrt(2)/2 = .707... (at the
far corners, A=0, I=0 or A=1, I=1).
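
The formula is the ordinary perpendicular distance from the point
(I, A) to the line A + I = 1.  A small sketch, with a hypothetical
function name, that evaluates it at the two nodes and at the far
corners:

```python
import math

def distance_from_main_sequence(a, i):
    """D = |A + I - 1| / sqrt(2), the distance from (I, A) to the line A + I = 1."""
    return abs(a + i - 1) / math.sqrt(2)

# The two nodes lie on the main sequence, so D is 0 there.
d_abstract_stable = distance_from_main_sequence(1.0, 0.0)
d_concrete_unstable = distance_from_main_sequence(0.0, 1.0)

# The corners farthest from the line give the maximum, sqrt(2)/2.
d_concrete_stable = distance_from_main_sequence(0.0, 0.0)
d_abstract_unstable = distance_from_main_sequence(1.0, 1.0)
```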

I will tolerate D metrics as high as .2 or .3 at times.  But most
should be zero and most of the rest should be less than .1.  This has
worked out pretty well for me.
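
One way these rules of thumb might be turned into a quick check.  The
cutoffs .1 and .3 come from the figures above; treating them as hard
bands, the band labels, and the function name are assumptions made
for illustration only.

```python
import math

def rate_category(a, i, near=0.1, tolerable=0.3):
    """Label a category by its distance D from the main sequence."""
    d = abs(a + i - 1) / math.sqrt(2)
    if d < near:
        return "on or near the main sequence"
    if d <= tolerable:
        return "tolerable at times"
    return "worth a closer look"

# Hypothetical categories given as (A, I) pairs.
ratings = {
    "abstract and independent": rate_category(1.0, 0.0),
    "mostly abstract, stable": rate_category(0.8, 0.0),
    "half abstract, stable": rate_category(0.5, 0.0),
}
```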