TITLE: OOD principles - abstraction and stability
AUTHOR: rmartin@rcmcon.com (Robert Martin), 1 Sep 95

The discussion over principle #8 was very interesting. We discussed the relationship between intrinsic stability and positional stability, and found that there are grounds for an equivalence between them. This next principle suggests that stability ought also to be a function of the composition of a category. It is not enough simply to call a category stable; it must also be composed of entities that have high intrinsic stability, namely abstract classes.

--------------------------------------------------------------------
9. The more stable a class category is, the more it must consist of
abstract classes.  A completely stable category should consist of
nothing but abstract classes.
--------------------------------------------------------------------

The stability referred to in this principle is again the I metric, which is based upon "positional" stability. That is, a module that is in a highly stable position in the dependency graph should also be highly abstract.

The justification for this principle is based upon the idea that executable code (the implementations of methods) changes more often than the interfaces between modules. Therefore interfaces have more intrinsic stability than executable code. In C++, it is more likely that you will change a .cc file than a .h file.

In some languages, the division between implementation and interface is very clear cut. But in others (like C++) it is blurred. The stability of the .h file is not as great as it could be because private functions and private variables are likely to need changing as the implementation evolves. In such languages, the maximum stability comes from pure interfaces. A class that contains pure interfaces is an abstract class. Thus, we can measure the abstraction of a category by computing the ratio of abstract classes to total classes:

    A = (# abstract classes) / (# of classes)
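As a rough illustration of the ratio above, the following Python sketch counts the abstract classes in a small category. The class names (Shape, Renderer, Square, TextRenderer) and the helper abstractness() are hypothetical, invented for this example. It also assumes that "abstract" means a class with at least one unimplemented abstract method, which is the test that Python's inspect.isabstract() applies.

```python
import inspect
from abc import ABC, abstractmethod


class Shape(ABC):
    """Pure interface: no implementation, so it counts as abstract."""

    @abstractmethod
    def area(self) -> float:
        """Return the area of the shape."""


class Renderer(ABC):
    """A second pure interface in the same hypothetical category."""

    @abstractmethod
    def draw(self, shape: Shape) -> str:
        """Return a textual rendering of the shape."""


class Square(Shape):
    """Concrete class: implements every abstract method of Shape."""

    def __init__(self, side: float) -> None:
        self.side = side

    def area(self) -> float:
        return self.side * self.side


class TextRenderer(Renderer):
    """Concrete class: implements every abstract method of Renderer."""

    def draw(self, shape: Shape) -> str:
        return f"shape with area {shape.area()}"


def abstractness(classes) -> float:
    """A = (# abstract classes) / (# of classes) for one category."""
    if not classes:
        raise ValueError("a category must contain at least one class")
    abstract = sum(1 for cls in classes if inspect.isabstract(cls))
    return abstract / len(classes)


# Hypothetical category: 2 abstract classes out of 4.
category = [Shape, Renderer, Square, TextRenderer]
print(abstractness(category))  # -> 0.5
```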
For example, if a class category contains 10 classes, of which 6 are abstract, then A = 0.6. A always lies in the range [0,1].

Principle 9 says that categories that are placed into positions of stability ought to be abstract. The reason for this is that the higher the abstraction of the category, the higher its intrinsic stability. And since intrinsic stability is limited by positional stability, intrinsically stable modules should also have positional stability.

Consider a category with very low abstraction. Such a category has mostly concrete classes in it. If it is put in a place of stability, then many other categories will depend upon it. And thus, it will be difficult to change its implementation.

On the other hand, consider a category with very high abstraction. Such a category consists mostly of abstract classes. If we put this category into a position of very low stability, then very few other categories will depend upon it. And then, of what use is the abstract interface?

But there is another reason to put abstractions in positions of stability. And that reason goes back to principle #1, the open closed principle. We want concrete categories to depend upon abstract categories so that the derivatives of those abstract categories can be controlled by the concrete categories. This is reuse. The concrete categories can be reused with different derivatives of the abstract categories. Also, when the implementations of those derivative categories change, the abstract category is not affected, and so the concrete categories that depend upon it are also unaffected. Thus, the concrete categories are open to be extended, but do not need to be modified in order to achieve that extension.

The open closed principle leads us to the realization that we do not want all of our categories to be stable. Stability is inflexibility. A stable category is difficult to change. And we do not want our applications to be difficult to change.
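The dependency structure described above, in which concrete code depends on an abstraction and is extended through derivatives, might be sketched as follows. This is only an illustration in Python. Sink, Report, ListSink and UpperCaseSink are hypothetical names made up for the example.

```python
from abc import ABC, abstractmethod


class Sink(ABC):
    """Stable abstraction: a pure interface with no implementation."""

    @abstractmethod
    def write(self, text: str) -> None:
        """Accept one line of output."""


class Report:
    """Concrete client; it depends only on the Sink abstraction."""

    def __init__(self, sink: Sink) -> None:
        self.sink = sink

    def publish(self, lines) -> None:
        for line in lines:
            self.sink.write(line)


class ListSink(Sink):
    """One derivative: keeps the lines in memory."""

    def __init__(self) -> None:
        self.lines = []

    def write(self, text: str) -> None:
        self.lines.append(text)


class UpperCaseSink(ListSink):
    """A later derivative; neither Sink nor Report has to change."""

    def write(self, text: str) -> None:
        super().write(text.upper())


# The same Report code is reused with two different derivatives.
plain, loud = ListSink(), UpperCaseSink()
Report(plain).publish(["ok"])
Report(loud).publish(["ok"])
print(plain.lines, loud.lines)  # -> ['ok'] ['OK']
```

Adding UpperCaseSink extends what Report can do without editing Report or Sink, which is the sense in which the client is open for extension yet closed for modification.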
But the open closed principle allows that a module that is highly stable (closed for modification) can also be open for extension. Yet this can only happen if the extension takes place in a derivative of the stable abstraction. Thus, in the ideal world our models would consist of two kinds of categories: completely abstract and stable categories that are depended upon by completely concrete and instable categories.

We are now in a position to define the relationship between stability (I) and abstractness (A). We can create a graph with A on the vertical axis and I on the horizontal axis. Since I increases with instability, I=0 is the most stable position. If we plot the two "good" kinds of categories on this graph, we will find the categories that are maximally stable and abstract at the upper left at (0,1). The categories that are maximally instable and concrete are at the lower right at (1,0).

But not all categories can fall into one of these two positions. Categories have degrees of abstraction and stability. For example, it is very common that one abstract class derives from another abstract class. The derivative is an abstraction that has a dependency. Thus, though it is maximally abstract, it will not be maximally stable. Its dependency will decrease its stability.

Consider a category with A=0 and I=0. This is a highly stable and concrete category. Such a category is not desirable because it is rigid. It cannot be extended because it is not abstract. And it is very difficult to change because of its stability.

Consider a category with A=1 and I=1. This category is also undesirable (perhaps impossible) because it is maximally abstract and yet has no dependents. It, too, is rigid because the abstractions are impossible to extend.

But what about a category with A=.5 and I=.5? This category is partially extensible because it is partially abstract. Moreover, it is partially stable so that the extensions are not subject to maximal instability. Such a category seems "balanced". Its stability is in balance with its abstractness.
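The positions discussed above can be collected into a small lookup. This is a sketch only: the function name describe() and the wording of the labels are assumptions made for this example. Note that its arguments are given in the order (A, I), whereas the graph coordinates in the text are written (I, A).

```python
def describe(a: float, i: float) -> str:
    """Label a category by abstractness A and instability I, both in [0, 1]."""
    if not (0.0 <= a <= 1.0 and 0.0 <= i <= 1.0):
        raise ValueError("A and I must both lie in [0, 1]")
    # Keys are (A, I) pairs for the four corners of the graph.
    corners = {
        (1.0, 0.0): "ideal: maximally abstract and stable",
        (0.0, 1.0): "ideal: maximally concrete and instable",
        (0.0, 0.0): "rigid: stable but concrete",
        (1.0, 1.0): "undesirable: abstract but without dependents",
    }
    return corners.get((a, i), "intermediate: partly abstract, partly stable")


print(describe(0.0, 0.0))  # -> rigid: stable but concrete
print(describe(0.5, 0.5))  # -> intermediate: partly abstract, partly stable
```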
Consider again the A-I graph (below). We can draw a line from (0,1) to (1,0). This line represents categories whose abstractness is "balanced" with stability. Because of its similarity to a graph used in astronomy, I call this line the "Main Sequence".

                |
                |
               1= (0,1)
                |\
                | \
    Abstractness|  \     The
                |   \      Main
                |    \       Sequence
                |     \
                |      \
                |       \ (1,0)
                +--------:--
                         1
                    Instability

A category that sits on the main sequence is not "too abstract" for its stability, nor is it "too instable" for its abstractness. It has the "right" number of concrete and abstract classes in proportion to its positional stability. Clearly, the most desirable positions for a category to hold are at one of the two endpoints of the main sequence. However, in my experience only about half the categories in a project can have such ideal characteristics. Those other categories have the best characteristics if they are on or close to the main sequence.

This leads us to another metric. If it is desirable for categories to be on or close to the main sequence, we can create a metric which measures how far away a category is from this ideal.

    D : Distance : |A+I-1| / sqrt(2) : The perpendicular distance of a category from the main sequence. This metric ranges over [0,~0.707]. (One can normalize this metric to range over [0,1] by using the simpler form |A+I-1|. I call this metric D'.)

Given this metric, a design can be analyzed for its overall conformance to the main sequence. The D metric for each category can be calculated. Any category that has a D value that is not near zero can be reexamined and restructured.

Statistical analysis of a design is also possible. One can calculate the mean and variance of all the D metrics within a design. One would expect a conformant design to have a mean and variance which are close to zero. The variance can be used to establish "control limits" which can identify categories that are "exceptional" in comparison to all the others.
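The D and D' metrics and the statistical check described above can be sketched in a few lines of Python. The category names and their (A, I) values are invented purely for illustration. The choice of a control limit at one standard deviation above the mean is likewise an assumption, since the text does not fix a particular limit.

```python
import math
import statistics


def distance(a: float, i: float) -> float:
    """D: perpendicular distance from the main sequence, in [0, ~0.707]."""
    return abs(a + i - 1.0) / math.sqrt(2.0)


def normalized_distance(a: float, i: float) -> float:
    """D': the same measure rescaled to the range [0, 1]."""
    return abs(a + i - 1.0)


# Hypothetical (A, I) measurements, not taken from any real design.
design = {
    "Interfaces": (1.0, 0.0),
    "Application": (0.0, 1.0),
    "Services": (0.5, 0.5),
    "Utilities": (0.0, 0.0),
}

d_values = {name: distance(a, i) for name, (a, i) in design.items()}
values = list(d_values.values())
mean_d = statistics.mean(values)
var_d = statistics.pvariance(values)

# Assumed control limit: one standard deviation above the mean.
limit = mean_d + math.sqrt(var_d)
outliers = [name for name, d in d_values.items() if d > limit]
print(outliers)  # -> ['Utilities']
```

Here the first three categories lie on the main sequence (D = 0), while the stable, concrete Utilities category sits at the maximum distance and is flagged for reexamination.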