TITLE: What is virtual abuse?

RESPONSE: jamshid (Jamshid Afshar), 17 Aug 94

Yesterday I mailed out a question about a piece of mis-commented code
in design/tcatalognode.h.  I got an answer but I want to use this
example as an opportunity to ramble a bit on a problem I see often in
our code.

>TCatalogNode and its base TCommonNode have the following commented
>virtual function:
>
>    // Return the name by which we will refer to the node
>    // (the specName, if not blank, otherwise, the generatedName).
>    //
>    virtual const TString& name() const;
>
>The TCatalogNode implementation, though, is just:
>
>	const TString& TCatalogNode :: name() const
>	{
>	    return specName();
>	} // name
>
>The base class TCommonNode::name() does return the generated name if
>specName() is blank.  So can I just remove the TCatalogNode override?

I got an answer and it is a definite NO.  The comments for both
TCommonNode::name() and TCatalogNode::name() should be changed,
though, to reflect the fact that although the default implementation
in TCommonNode will return the generated name, derived classes may
override it and return a blank string.

Apparently a catalog node "name" is used in various places where we do
not want the user to see the unique (but ugly) generated name.  I
guess those places either use the blank string, or they use a
different string (not the generated name) when they notice it is
blank.  I'll try to fix the comments when I'm back on the trunk.

I think this example reflects a bigger, general problem in our C++
code than just out-of-sync comments: we use virtual functions very
often but do not define well their semantics.  Without this guidance a
virtual function loses its utility and becomes a serious problem
during code changes.  Users of the class with an inadequately defined
virtual function don't want to or can't call it because they have no
idea what some derived classes will do with it.  If they do use the
function, they often end up `switch'ing based on the object type for
cases where a derived class doesn't do what they thought it should do.
Or, they may need to `switch' because an overriden function requires
the object (or some related object) to be in a different state than
what the original definition of the function required.  This type
switching is exactly the maintenance headache virtual functions were
designed to avoid.

It's extremely important to define the intended semantics and pre &
post conditions when you make a function virtual.  This extra design
work (and the inexact forecasting it often requires) is the reason I
avoid making functions virtual unless I know why a derived class would
want to override it.  Making functions virtual simply "by default"
without a clear (documented) idea of why or how someone would want to
override it and without documenting the original design's pre and post
conditions leads to hard to maintain code.

I like the approach Eiffel takes of making "constraints" an integral
part of the language.  You define the pre and post conditions for each
method of a class.  Derived classes may only loosen the preconditions
or constrain further the post conditions, never the other way around.
If a derived class' implementation really has some new preconditions,
they must be "moved up" the hierarchy.  It may require some work to
make sure this new precondition is always met, but you're then left
with a coherent program.  If you are calling a function and expecting
it to do something or leave the object in a state not actually
guaranteed by its postconditions, but this postcondition is
implemented, consider formally adding this requirement to the
postconditions.

To sum up, when designing a class don't think "well, I don't know what
another user might want to do with this member function, so I'll just
make it virtual".  That's exactly the reason NOT to let users override
that function in their derived classes.  If you don't have any idea
how it should be overriden, how do you expect other users to override
it "correctly" and preserve the semantics you, or any reasonable user
of your class, expects?

When deriving from a class, be careful to preserve the semantics of
the parent virtual function.  If there's no documentation, there's no
better time (except the past ;-) to add some comments!  Don't use
virtual functions to "hijack" a class.

If you're considering overriding or calling an existing virtual
function but the function is either not well-defined or doesn't really
do what you want it to do, don't be afraid to redesign a bit at the
base class by defining a new function which does provide the interface
you really intend.  It is of course a good idea to talk with a couple
of people before making such changes -- someone else may either show
you why you shouldn't take that approach, or they may have already
been thinking about such a change and can offer advice.

For example, instead of TCatalogNode overriding name(), maybe we
should have defined a new function in TCommonNode like prettyName()
which returned "a name that we'd want the user to see to represet the
node, with no guarantees on whether it can later be successfully
passed to a findNode() function".  Or maybe we should have simply not
defined name() and let users instead use the well-defined(?)
specName() or myGeneratedName().  A good class design is useless
without good documentation showing how to use or extend it.  Even a
design that turns out to be inadequate in the future can usually be
changed or reimplemented easily if we know enough about it.