TITLE: Inheritance and traits

(Newsgroup: comp.lang.c++.moderated, 5 Aug 2000)

AUTHOR: Herb Sutter <hsutter@peerdirect.com>

 --------------------------------------------------------------------
   Guru of the Week problems and solutions have appeared regularly
   on news:comp.lang.c++.moderated since Feb 1997. For past issues,
     see the GotW archive at http://www.peerdirect.com/resources.
  (c)2000 H P Sutter. News archives may keep copies of this article.
 --------------------------------------------------------------------

_______________________________________________________

GotW #71:   Inheritance Traits?

Difficulty: 5 / 10
_______________________________________________________


Traits and Extensibility
------------------------

>1. What is a traits class?

Quoting 17.1.18 in the C++ standard, a traits class is:

   "a class that encapsulates a set of types and functions necessary
   for template classes and template functions to manipulate objects
   of types for which they are instantiated."

The idea is that traits classes are templates used to carry extra
information -- especially information that templates can use -- about
the classes on which the traits template is instantiated.  The nice
thing is that the traits class T<C> tacks on said extra information to
class C without requiring any change at all to C. Despite all the talk
about "tacking on," traits are quite useful -- not "tacky" at all.

For examples, see:

   - Items 2 and 3 in Exceptional C++.[1]

   - GotW #62 on "Smart Pointer Members."[2]

   - The April, May, and June 2000 issues of C++ Report,
     which contained several excellent columns about traits.

   - The C++ standard library's own char_traits, iterator
     categories, and similar mechanisms.
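
As a minimal sketch of the idea (the names NameTraits, Widget, Gadget,
and Describe are hypothetical and appear nowhere else in this article),
a traits template can attach information to a class without changing
the class itself:

```cpp
#include <string>

// Hypothetical types, used for illustration only.
struct Widget {};
struct Gadget {};

// Primary traits template: default information for any type C.
template<class C>
struct NameTraits
{
  static std::string Name() { return "unknown"; }
};

// Specialization: extra information about Widget, added without
// making any change to Widget itself.
template<>
struct NameTraits<Widget>
{
  static std::string Name() { return "Widget"; }
};

// A template that consults the traits to learn about its parameter.
template<class T>
std::string Describe( const T& )
{
  return "object of type " + NameTraits<T>::Name();
}
```

Here Describe( Widget() ) picks up the specialized information, while
Describe( Gadget() ) falls back to the general-purpose default.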


Requiring Member Functions
--------------------------

>2. Demonstrate how to detect and make use of class members within
>   templates, using the following motivating case: You want to write
>   a templated class C that can only be instantiated on types having a
>   function named Clone that takes no parameters and returns a pointer
>   to the same kind of object.
>
>    template<class T>   // T must provide T* T::Clone()
>    class C
>    {
>      // ...
>    };
>
>   Note: It's obvious that if C just writes code that tries to invoke
>   T::Clone() without parameters, then such code will fail to compile
>   if there isn't a T::Clone() that can be called without parameters.

For an example to illustrate that last note, consider:

  // Example 2(a): Initial attempt, sort of requires Clone()
  //
  template<class T>   // T must provide /*...*/ T::Clone( /*...*/ )
  class C
  {
  public:
    void SomeFunc( T* t )
    {
      // ...
      t->Clone();
      // ...
    }
  };

The first problem with Example 2(a) is that it doesn't necessarily
require anything at all -- in a template, only the member functions
that are actually used will be instantiated, and a call through a
dependent type such as t->Clone() is not checked until then.[3]  If
SomeFunc() is never used, it will never be instantiated and so C can
easily be instantiated with T's that don't have anything resembling
Clone().
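
To see the problem concretely, here is a small sketch in which C is
instantiated on a type with no Clone() at all.  NoClone, Answer(), and
UseWithoutSomeFunc() are hypothetical names added only for this
illustration, and the sketch assumes a compiler that instantiates
member functions only on use:

```cpp
struct NoClone {};   // hypothetical type with nothing resembling Clone()

template<class T>    // T is supposed to provide Clone()
class C
{
public:
  void SomeFunc( T* t )
  {
    t->Clone();      // checked only if SomeFunc() is instantiated
  }

  int Answer() const { return 42; }   // hypothetical, unrelated to Clone()
};

// Instantiates C<NoClone> and C<NoClone>::Answer(), but never
// C<NoClone>::SomeFunc(), so the missing Clone() goes unnoticed.
inline int UseWithoutSomeFunc()
{
  C<NoClone> c;
  return c.Answer();
}
```

Adding a call to c.SomeFunc() would be the first point at which the
compiler complains.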

The solution is to put the code that enforces the requirement into a
function that's sure to be instantiated.  The first thing most people
think of is to put it in the constructor, because of course it's
impossible to use C without invoking its constructor somewhere, right?
True enough, but there could be multiple constructors and then to be
safe we'd have to put the requirement-enforcing code into every
constructor.  There's a much easier solution, namely:  Put it in the
destructor.  There's only one destructor, and it's hard to use C
without invoking its destructor (short of creating a C object
dynamically and never deleting it), so that's the simplest place for
the requirement-enforcing code to live.

  // Example 2(b): Revised attempt, requires Clone()
  //
  template<class T>   // T must provide /*...*/ T::Clone( /*...*/ )
  class C
  {
  public:
    ~C()
    {
      // ...
      T t;          // kind of wasteful
      t.Clone();
      // ...
    }
  };

This leaves us with the second problem:  Neither Example 2(a) nor 2(b)
so much tests the constraint as simply relies on it.  (In the case of
Example 2(b), it's even worse because 2(b) does it in a wasteful way
that adds unnecessary runtime code, and requires T to have a default
constructor, just to try to enforce a constraint.)  As noted in the
question statement itself, continuing on:

>   But that's not enough to answer this question: Just trying to call
>   T::Clone() without parameters would also succeed in calling a
>   Clone() that has defaulted parameters and/or does not return a T*.

The code in Examples 2(a) and 2(b) will indeed work most swimmingly if
there is a function "T* T::Clone();".  The problem is that it will
also work most swimmingly if there is a function "void T::Clone();",
or "T* T::Clone( int = 42 );", or other variant signature, as long as
it can be called without parameters.  (For that matter, it will also
work even if there isn't a Clone() member function at all, as long as
there's a macro that changes the name Clone to something else, but
there's little we can do about that.)
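
A short sketch makes the looseness visible.  The two types below are
hypothetical, as are Accepted() and BothAccepted(); neither type
provides "T* T::Clone();", yet both satisfy the check used in
Example 2(b):

```cpp
// Wrong return type.
struct ReturnsVoid   { void Clone() {} };

// Takes a parameter, but one with a default value.
struct HasDefaultArg { HasDefaultArg* Clone( int = 42 ) { return 0; } };

template<class T>
class C
{
public:
  ~C()
  {
    T t;          // same "check" as Example 2(b)
    t.Clone();
  }

  bool Accepted() const { return true; }   // hypothetical helper
};

// Both instantiations compile, destructors included.
inline bool BothAccepted()
{
  C<ReturnsVoid>   a;
  C<HasDefaultArg> b;
  return a.Accepted() && b.Accepted();
}
```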

All that may be fine in some applications, but it's not what the
question asked for.  What we want to achieve is stronger:

>   The goal here is to specifically enforce that T provide a function
>   that looks exactly like this: T* T::Clone().

So here's one way we can do it:

  // Example 2(c): Better, requires exactly T* T::Clone()
  //
  template<class T>   // T must provide T* T::Clone()
  class C
  {
  public:
    // in C's destructor (easier than putting it in every C ctor):
    ~C()
    {
      T* (T::*test)() const = &T::Clone;
        // note the const: Clone() must be a const member function
      (void)test; // suppress warnings about unused variables
        // this unused variable is likely to be optimized away entirely

      // ...
    }

    // ...
  };

Or, a little more cleanly and extensibly:

  // Example 2(d): Alternative way of requiring exactly T* T::Clone()
  //
  template<class T>   // T must provide T* T::Clone()
  class C
  {
    bool ValidateRequirements()
    {
      T* (T::*test)() const = &T::Clone;
        // note the const: Clone() must be a const member function
      (void)test; // suppress warnings about unused variables
      // ...
      return true;
    }

  public:
    // in C's destructor (easier than putting it in every C ctor):
    ~C()
    {
      assert( ValidateRequirements() );
    }

    // ...
  };

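Having a ValidateRequirements() function is extensible -- it gives us
a nice clean place to add any future requirements checks.  Calling it
within an assert() further ensures that all traces of the requirements
machinery will disappear from release builds.  Note the flip side:
when NDEBUG is defined the assert() expands to nothing, so
ValidateRequirements() is never instantiated and the requirement is
enforced only in builds where assertions are enabled.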
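
For comparison, the same signature test can be written as a
compile-time constant.  This is a different technique from the
pointer-to-member initialization above, and it assumes a C++11 or
later compiler (decltype and <type_traits>).  HasExactCloneSignature
and the four test types are hypothetical names:

```cpp
#include <type_traits>

// Is is 1 only if &T::Clone has exactly the type T* (T::*)() const.
// Assumption: T has one non-overloaded member named Clone; otherwise
// the expression &T::Clone does not compile at all.
template<class T>
struct HasExactCloneSignature
{
  enum { Is = std::is_same<decltype(&T::Clone),
                           T* (T::*)() const>::value ? 1 : 0 };
};

// Hypothetical types exercising the variants discussed in the text.
struct Good       { Good* Clone() const { return new Good( *this ); } };
struct NonConst   { NonConst* Clone() { return new NonConst( *this ); } };
struct WrongRet   { void Clone() const {} };
struct DefaultArg { DefaultArg* Clone( int = 42 ) const { return 0; } };
```

Only Good yields 1.  The other three yield 0 because a missing const,
a different return type, or a defaulted parameter each changes the
type of &T::Clone.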


Requiring Inheritance
---------------------

NOTE:  I have seen the following ideas in several places -- I think
that two of those places were in published or unpublished articles by
Andrei Alexandrescu and Bill Gibbons.  My apologies if the code I'm
about to show looks really similar to someone's published code; I'm
reusing other people's good ideas, but writing this code off the top
of my head.

>3. A programmer wants to write a template that can require (or just
>   detect) whether the type on which it is instantiated has a Clone()
>   member function.  The programmer decides to do this by requiring
>   that classes offering such a Clone() must derive from a fixed
>   Clonable class.  Demonstrate how to write this template:
>
>    template<class T>
>    class X
>    {
>      // ...
>    };
>
>   a) to require that T be derived from Clonable; and

First, we define a helper template that tests whether a candidate type
D is derived from B. It determines this by determining whether a
pointer to D can be converted to a pointer to B:

  // Example 3(a): An IsDerivedFrom helper
  //
  template<class D, class B>
  class IsDerivedFrom
  {
  private:
    class Yes { char a[1];  };
    class No  { char a[10]; };

    static Yes Test( B* );  // undefined
    static No  Test( ... ); // undefined

  public:
    enum { Is = sizeof(Test(static_cast<D*>(0))) == sizeof(Yes) ? 1 : 0 };
  };

Get it?  Think about this code for a moment before reading on.

The above trick relies on three things:

 - Yes and No have different sizes. I may just be being paranoid, but
   the reason I don't use just char[1] and char[2] is the off chance
   that the sizes of Yes and No might be the same, for example if the
   compiler happened to require that an object's size be a multiple of
   four bytes. I doubt it would ever happen, but I can't see any
   wording in the standard that would prohibit it.

 - Overload resolution and determining the value of sizeof are both
   performed at compile time, not runtime.

 - Enums are initialized, and values can be used, at compile time.

Let's analyze the enum definition in a little more detail.  First, the
innermost part is:

                       Test(static_cast<D*>(0))

All this does is mention a function named Test and pretend to pass it
a D* -- in this case, a suitably cast null pointer will do.  Note that
nothing is actually being done here, and no code is being generated,
so the pointer is never dereferenced or for that matter even ever
actually created.  All we're doing is creating a typed expression.
Now, the compiler knows what D is, and will apply overload resolution
at compile time to decide which of the two overloads of Test() ought
to be chosen:  If a D* can be converted to a B*, then Test( B* ),
which returns a Yes, would get selected; otherwise, Test( ...  ),
which returns a No, would get selected.

The next step is to check which overload would get selected:

                sizeof(Test(static_cast<D*>(0))) == sizeof(Yes) ? 1 : 0

This expression, still evaluated entirely at compile time, will yield
1 if a D* can be converted to a B*, and 0 otherwise.  And that's
pretty much all we want to know, because a D* can be converted to a B*
if and only if D is derived from B (or D is the same as B, but we'll
plug that hole presently).

So, now that we've calculated what we need to know, we just need to
store the result someplace.  The said "someplace" has to be a place
that can be set and the value used all at compile time.  Fortunately,
an enum fits the bill nicely:

    enum { Is = sizeof(Test(static_cast<D*>(0))) == sizeof(Yes) ? 1 : 0 };

Finally, there is still that potential hole when D is the same as B,
and depending on the way we plan to use IsDerivedFrom we may not want
this template to report that a class is derived from itself (a lie, if
perhaps a benign one in some applications).  If we do need to plug the
hole, we can do it easily by partially specializing the template to
say that a class is not derived from itself:

  // Example 3(a), continued
  //
  template<class T>
  class IsDerivedFrom<T,T>
  {
  public:
    enum { Is = 0 };
  };
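
Putting the primary template and the specialization together, a quick
check might look like the following sketch.  Base, Derived, and
Unrelated are hypothetical classes introduced only to exercise the
helper:

```cpp
template<class D, class B>
class IsDerivedFrom
{
private:
  class Yes { char a[1];  };
  class No  { char a[10]; };

  static Yes Test( B* );  // undefined
  static No  Test( ... ); // undefined

public:
  enum { Is = sizeof(Test(static_cast<D*>(0))) == sizeof(Yes) ? 1 : 0 };
};

// A class is not reported as derived from itself.
template<class T>
class IsDerivedFrom<T,T>
{
public:
  enum { Is = 0 };
};

// Hypothetical classes for illustration.
class Base {};
class Derived : public Base {};
class Unrelated {};
```

With these definitions, IsDerivedFrom<Derived, Base>::Is is 1, while
the value is 0 for <Unrelated, Base>, for <int, Base>, and (thanks to
the specialization) for <Base, Base>.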

That's it.  We can now use this facility to help build an answer to
the question, to wit:

  // Example 3(a), continued: Using IsDerivedFrom to enforce
  //                          derivation from Clonable
  //
  template<class T>
  class X
  {
    bool ValidateRequirements() const
    {
      // typedef needed: a bare comma would split assert()'s macro argument
      typedef IsDerivedFrom<T, Clonable> Y;
      assert( Y::Is );
        // a runtime check, but one that can be turned
        // into a compile-time check without much work

      return true;
    }

  public:
    // in X's destructor (easier than putting it in every X ctor):
    ~X()
    {
      assert( ValidateRequirements() );
    }

    // ...
  };
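
One way to turn the runtime check into a compile-time one is sketched
below.  It assumes a C++11 or later compiler for static_assert, which
postdates this article.  The definition of Clonable is also an
assumption, since the question never shows it, and Ok() is a
hypothetical member added so that there is something to call:

```cpp
// Assumed definition; the article never shows Clonable itself.
class Clonable
{
public:
  virtual ~Clonable() {}
  virtual Clonable* Clone() const = 0;
};

template<class D, class B>
class IsDerivedFrom
{
  class Yes { char a[1];  };
  class No  { char a[10]; };
  static Yes Test( B* );  // undefined
  static No  Test( ... ); // undefined
public:
  enum { Is = sizeof(Test(static_cast<D*>(0))) == sizeof(Yes) ? 1 : 0 };
};

template<class T>
class IsDerivedFrom<T,T> { public: enum { Is = 0 }; };

template<class T>
class X
{
  // Checked whenever X<T> is instantiated, in every build mode.
  static_assert( IsDerivedFrom<T, Clonable>::Is != 0,
                 "T must be derived from Clonable" );
public:
  bool Ok() const { return true; }   // hypothetical member
};

class MyClonable : public Clonable
{
public:
  MyClonable* Clone() const { return new MyClonable( *this ); }
};
```

X<MyClonable> compiles as usual, whereas declaring an X<int> object
is rejected with the message given in the static_assert.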


Selecting Alternative Implementations
-------------------------------------

Well, the solution in Example 3(a) is nice and all, and it'll make
sure T must be a Clonable, but what if T isn't a Clonable?  What if
there were some alternative action we could take?  Perhaps we could
make things even more flexible -- which brings us to the second part
of the question.

>   b) to provide an alternative implementation if T is derived from
>      Clonable, and work in some default mode otherwise.

To do this, we introduce the proverbial extra level of indirection, in
this case a helper template.  In short, X will use IsDerivedFrom, and
use partial specialization of the helper to switch between is-Clonable
and isn't-Clonable implementations:

  // Example 3(b): Using IsDerivedFrom to make use of derivation
  //               from Clonable if available, and do something else
  //               otherwise.
  //
  template<class T, int>
  class XImpl
  {
    // general case: T is not derived from Clonable
  };

  template<class T>
  class XImpl<T, 1>
  {
    // T is derived from Clonable
  };

  template<class T>
  class X
  {
    XImpl<T, IsDerivedFrom<T, Clonable>::Is> impl_;

    // ... delegates to impl_ ...
  };

Do you see how this works?  Let's work through it with a quick
example:

  class MyClonable : public Clonable { /*...*/ };

  X<MyClonable> x1;

X<T>'s impl_ has type XImpl<T, IsDerivedFrom<T, Clonable>::Is>.  In
this case, T is MyClonable, and so X<MyClonable>'s impl_ has type
XImpl<MyClonable, IsDerivedFrom<MyClonable, Clonable>::Is>, which
evaluates to XImpl<MyClonable, 1>, which uses the specialization of
XImpl that makes use of the fact that MyClonable is derived from
Clonable.  But what if we instantiate X with some other type?
Consider:

  X<int> x2;

Now T is int, and so X<int>'s impl_ has type XImpl<int,
IsDerivedFrom<int, Clonable>::Is>, which evaluates to XImpl<int, 0>,
which uses the unspecialized XImpl.  Nifty, isn't it?

Note that at most XImpl<T,0> and XImpl<T,1> will ever be instantiated
for any given T. Even though XImpl's second parameter could
theoretically take any integer value, the way we've set things up here
the integer can only ever be 0 or 1. (In that case, why not use a bool
instead of an int?  Extensibility:  It doesn't hurt to use an int, and
doing so allows additional alternative implementations to be added
easily in the future.)
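
Here is a filled-in sketch of Example 3(b) that can actually be run.
The Mode() members are hypothetical stand-ins for whatever the two
implementations would really do, and the minimal Clonable is an
assumed definition:

```cpp
#include <string>

class Clonable { public: virtual ~Clonable() {} };  // assumed definition

template<class D, class B>
class IsDerivedFrom
{
  class Yes { char a[1];  };
  class No  { char a[10]; };
  static Yes Test( B* );  // undefined
  static No  Test( ... ); // undefined
public:
  enum { Is = sizeof(Test(static_cast<D*>(0))) == sizeof(Yes) ? 1 : 0 };
};

template<class T>
class IsDerivedFrom<T,T> { public: enum { Is = 0 }; };

template<class T, int>
class XImpl
{
public:
  // general case: T is not derived from Clonable
  std::string Mode() const { return "default"; }
};

template<class T>
class XImpl<T, 1>
{
public:
  // T is derived from Clonable
  std::string Mode() const { return "clonable"; }
};

template<class T>
class X
{
  XImpl<T, IsDerivedFrom<T, Clonable>::Is> impl_;
public:
  std::string Mode() const { return impl_.Mode(); }  // delegates to impl_
};

class MyClonable : public Clonable { /*...*/ };
```

With this in place, X<MyClonable>().Mode() returns "clonable" and
X<int>().Mode() returns "default".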


Requirements vs. Traits
-----------------------

>4. Is the approach in #3 the best way to require/detect the
>   availability of a Clone()?  Describe alternatives.

The approach in #3 is nifty, but I tend to like traits better in many
cases -- they're about as simple (except when they have to be
specialized for every class in a hierarchy), and they're more
extensible as shown in GotW #62.

The idea is to create a traits template whose sole purpose in life, in
this case, is to implement a Clone() operation.  The traits template
looks a lot like XImpl, in that there'll be a general-purpose
unspecialized version that does something general-purpose, and
possibly multiple specialized versions that deal with classes that
provide better or just different ways of cloning.

  // Example 4: Using traits instead of IsDerivedFrom to make use
  //            of Clonability if available, and do something else
  //            otherwise.  Requires writing a specialization for
  //            each Clonable class.
  //
  template<class T>
  class XTraits
  {
  public:
    // general case: use copy constructor
    static T* Clone( const T* p ) { return new T( *p ); }
  };

  template<>
  class XTraits<MyClonable>
  {
  public:
    // MyClonable is derived from Clonable, so use Clone()
    static MyClonable* Clone( const MyClonable* p ) { return p->Clone(); }
  };

  // ... etc. for every class derived from Clonable

X<T> then simply calls XTraits<T>::Clone() where appropriate, and it
will do the right thing.
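
A runnable sketch of how X might use the traits follows.  MakeCopy(),
the viaClone_ flag, and the definition of Clonable are assumptions
made for illustration; the flag exists only to show which cloning
path was taken:

```cpp
// Assumed definition; the article never shows Clonable itself.
class Clonable
{
public:
  virtual ~Clonable() {}
  virtual Clonable* Clone() const = 0;
};

class MyClonable : public Clonable
{
public:
  MyClonable() : viaClone_( false ) {}
  MyClonable* Clone() const
  {
    MyClonable* p = new MyClonable( *this );
    p->viaClone_ = true;           // record that Clone() made this copy
    return p;
  }
  bool ViaClone() const { return viaClone_; }
private:
  bool viaClone_;
};

template<class T>
class XTraits
{
public:
  // general case: use copy constructor
  static T* Clone( const T* p ) { return new T( *p ); }
};

template<>
class XTraits<MyClonable>
{
public:
  // MyClonable is derived from Clonable, so use Clone()
  static MyClonable* Clone( const MyClonable* p ) { return p->Clone(); }
};

template<class T>
class X
{
public:
  // Returns a newly allocated copy; the caller owns it.
  T* MakeCopy( const T& t ) const { return XTraits<T>::Clone( &t ); }
};
```

X<int> copies through the copy constructor, while X<MyClonable> goes
through MyClonable::Clone(), with no change to X itself.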

The main difference between traits and the plain old XImpl shown in
Example 3(b) is that, with traits, when the user defines some new type
the most work they have to do to use it with X is to specialize the
traits template to "do the right thing" for the new type.  That's more
extensible than the relatively hard-wired approach in #3 above, which
does all the selection inside the implementation of XImpl instead of
opening it up for extensibility.  It also allows for other cloning
methods besides a function specifically named Clone() inherited from a
specifically named base class, and this too provides extra
flexibility.

For more details, including a longer sample implementation of traits
for a very similar example, see GotW #62, Examples 3(c)(i) and
3(c)(ii).


Hierarchy-Wide Traits
---------------------

The main drawback of the traits approach above is that it requires
individual specializations for every class in a hierarchy.  There are
ways to provide traits for a whole hierarchy of classes at a time,
instead of tediously writing lots of specializations.  See Andrei
Alexandrescu's excellent column in the June 2000 C++ Report, where he
describes a nifty technique to do just this.[4]  Andrei's technique
requires minor surgery on the base class of the outside class
hierarchy, in this case Clonable.  It would be nice if we could
specialize XTraits for the whole Clonable hierarchy in one shot
without requiring any change to Clonable -- this is a topic for a
potential future issue of GotW.
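
To be clear, the following is not the technique from the column cited
above.  It is only one possible sketch, combining IsDerivedFrom from
Example 3(a) with the traits of Example 4 so that the general-purpose
XTraits picks the Clone() route for anything derived from Clonable.
XTraitsImpl, Circle, and the definition of Clonable are hypothetical,
and the sketch assumes that every class in the hierarchy overrides
Clone():

```cpp
// Assumed definition; the article never shows Clonable itself.
class Clonable
{
public:
  virtual ~Clonable() {}
  virtual Clonable* Clone() const = 0;
};

template<class D, class B>
class IsDerivedFrom
{
  class Yes { char a[1];  };
  class No  { char a[10]; };
  static Yes Test( B* );  // undefined
  static No  Test( ... ); // undefined
public:
  enum { Is = sizeof(Test(static_cast<D*>(0))) == sizeof(Yes) ? 1 : 0 };
};

template<class T>
class IsDerivedFrom<T,T> { public: enum { Is = 0 }; };

template<class T, int>
class XTraitsImpl
{
public:
  // general case: use copy constructor
  static T* Clone( const T* p ) { return new T( *p ); }
};

template<class T>
class XTraitsImpl<T, 1>
{
public:
  // T is derived from Clonable: use Clone(), casting back down to T*
  // in case Clone() is declared to return Clonable*
  static T* Clone( const T* p ) { return static_cast<T*>( p->Clone() ); }
};

// One definition covers the whole hierarchy, and Clonable is unchanged.
template<class T>
class XTraits : public XTraitsImpl<T, IsDerivedFrom<T, Clonable>::Is> {};

// Hypothetical hierarchy member; viaClone records the path taken.
class Circle : public Clonable
{
public:
  explicit Circle( int r ) : radius( r ), viaClone( false ) {}
  Clonable* Clone() const
  {
    Circle* p = new Circle( *this );
    p->viaClone = true;
    return p;
  }
  int  radius;
  bool viaClone;
};
```

Users can still specialize XTraits for an individual type where the
defaults don't fit, so the extensibility of the traits approach is
kept.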


Inheritance vs. Traits
----------------------

>5. Can a template benefit significantly from knowing that its parameter
>   type T is inherited from some other type, in a way that could not
>   be achieved at least as well otherwise without the inheritance
>   relationship?

As far as I can tell at this writing, there is little extra benefit a
template can gain from knowing that one of its template parameters is
derived from some given base class that the template couldn't gain
more extensibly via traits.  The only real drawback to using traits is
that it can require writing lots of traits specializations to handle
many classes in a big hierarchy, but there are techniques that
mitigate or eliminate this drawback.

A principal motivator for this GotW issue was to demonstrate that
"using inheritance for categorization in templates" is perhaps not as
necessary a reason to use inheritance as some have thought.  Traits
provide a more general mechanism that's much more extensible when it
comes time to instantiate an existing template on new types, such as
types that come from a third-party library, that may not be easy to
derive from a foreordained base class.


[1] Sutter, H. "Exceptional C++" (Addison-Wesley, 2000).

[2] Available at http://www.peerdirect.com/resources/gotw062a.html.

[3] I'm not sure that all compilers get this rule right yet. Yours may
well instantiate all functions, not just the ones that are used.

[4] Alexandrescu, A. "Traits on Steroids" (C++ Report 12(6), June
2000).


---
Herb Sutter (mailto:hsutter@peerdirect.com)

CTO, PeerDirect Inc. (http://www.peerdirect.com)
Chair, ANSI SQL Part 12: SQL/Replication
Contributing Editor, C/C++ Users Journal (http://www.cuj.com)
