TITLE: Per-class or per-object encapsulation?


PROBLEM: rob@dutiag.twi.tudelft.nl (Rob Verver)

I believe that per-class encapsulation can cause some problems for
reuse, even in languages not featuring sub-typing, such as Eiffel
and C++.

Per-class encapsulation states that the methods of a particular
class C can access not only some private feature f of the receiver
object, but also that of every other instance of C, when passed
as a parameter. This is what causes the problem.


[ Most of you will probably want to read only the short version
  below. Gluttons-for-punishment can read the long version. -adc ]


RESPONSE: cline@sun.soe.clarkson.edu (Marshall Cline), 11 Feb 95

According to my experience teaching OO to thousands of people, this
comment (question?) is actually fairly common.  So common, in fact,
that I should probably put this answer in the C++ FAQ.  Sigh.

Short answer: Per-object encapsulation tends to force an object to
publicly expose its private parts.  With dynamically typed OO
programming languages (Smalltalk, CLOS, etc), per-object encapsulation
tends to gain reusability at the expense of encapsulation, while
per-class encapsulation allows programmers to make that tradeoff in
either direction.  With statically typed OO programming languages
(C++, Eiffel, etc), there is much less of a tradeoff: per-object
encapsulation is generally worse than per-class encapsulation (which
is why most statically typed OO programming languages opt for
per-class encapsulation).

Long answer:

Premise: An object's interface should not be dependent on the private
data structure used to implement the object.  E.g., a Stack object
might be expected to have a getNumElems method independent of whether
it had a private numElems data member, but it would normally not have
a getList method even if it happened to have a private List data
member.

Definition: "Per-Class Encapsulation" gives a method of class X direct
access to any object which is known to be of class X ("direct access"
means the ability to access the private parts of an object without
going through some access method).

Definition: "Per-Object Encapsulation" gives a method of class X
direct access to the this (or self) object, but denies the method
direct access to any other X objects (such as to parameter-objects).

Consider the following copyFrom method as prototypical of methods that
accept another object of their class as a parameter (in C++, these
would be called the copy constructor and assignment operator):

	//pidgin C++

	class Stack {
	public:
	  //...

	  int getNumElems() const
	  {
	    //This "get" method belongs in the public interface since:
	    // (1) it doesn't expose the implementation technique, and
	    // (2) users might find it useful.
	    //...
	  }

	  void copyFrom(const Stack& from)
	  {
	    //If copyFrom can't access "from.data" directly, it would
	    //need a public method to access "from.data" (such as the
	    //"getList()" method; see below).
	    //...
	  }

	  List& getList()
	  {
	    //This "get" method does not "belong" in the public
	    //interface (it exposes the implementation technique),
	    //yet per-object encapsulation would make it needed...
	    return data;  //yuk
	  }

	private:
	  List data;
	};

Conclusion: Although pushing encapsulation down to the object
granularity rather than the class granularity seems to improve
encapsulation, in practice it can actually degrade encapsulation,
since it forces some classes to artificially expose their private
parts.
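
To make the conclusion concrete, here is a minimal compilable sketch of
the same Stack in real C++.  It assumes std::vector<int> as a stand-in
for the List member, and it adds push and pop methods that are not in
the pidgin version; those choices are illustrative only.  Because C++
uses per-class encapsulation, copyFrom can read from.data directly and
no public getList method is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch only: std::vector<int> stands in for the "List" of the
// pidgin example; push() and pop() are added for illustration.
class Stack {
public:
  void push(int x) { data.push_back(x); }

  // Precondition: the stack is not empty.
  int pop() {
    assert(!data.empty());
    int top = data.back();
    data.pop_back();
    return top;
  }

  // Implementation-independent query: fine in the public interface.
  std::size_t getNumElems() const { return data.size(); }

  // Per-class encapsulation: "from" is known to be a Stack, so its
  // private representation is directly accessible here.  No public
  // getList() is required.
  void copyFrom(const Stack& from) {
    if (this != &from) data = from.data;
  }

private:
  std::vector<int> data;
};
```

With this class, copying one stack into another via b.copyFrom(a)
leaves a untouched, and nothing in the public interface reveals that a
vector is used underneath.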

Warning: Some developers define a public "get" and/or "set" method for
every (or nearly every) private datum, even if there is no
implementation independent need for the class's interface to have such
a public "get" and/or "set" method.  In statically typed languages,
this practice can weaken the advantages of encapsulation.  Our book,
"C++ FAQs" (Addison-Wesley 1995), discusses this OO design topic in
the chapters on Interface Design.

Although many dynamically typed OO languages have per-object
encapsulation, they could also have per-class encapsulation with a bit
of trickery.  One could imagine a dynamically typed OO language where
a method of class X would (dynamically) check that a parameter object
is really of class X, after which the method would be granted direct
access to the parameter object's state (provided of course that the
method didn't attempt to re-bind the parameter so it referred to a
different object).

For example, suppose the programmer was willing to require the
parameter of method "X::copyFrom(anX)" to actually be of class X (or
perhaps of a class that inherited from X, provided the language was
willing to unify the subtype and subclass hierarchies).  The
programmer of the "X::copyFrom(anX)" method might express this using
some syntax such as the following pseudo-dynamically-typed-OOPL (the
"of_class X" part of the parameter list tells the compiler to generate
a dynamic test to verify that anX is actually an object of class X):

	X::copyFrom(anX of_class X)
	begin
	  self.priv := anX.priv;
	    //Assuming "priv" is a private data member of X objects,
	    //the above assignment would be ok because the compiler
	    //now knows that "anX" is an X object.
	end

Please note that there is a tradeoff here.  The advantage of per-class
encapsulation would be that class X wouldn't necessarily have to
export its "priv" data member via a getPriv method.  The disadvantage
would be a loss of generality/reusability: the only objects that could
be passed to the copyFrom method would be those of class X (rather
than the more general category of "those with a getPriv method").
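
C++ is statically typed, but the run-time check imagined above can be
approximated there with dynamic_cast.  The following is only a sketch
under assumptions: the root class Object, the unrelated class Y, the
bool return value, and the samePrivAs helper are all hypothetical names
introduced for illustration; none of them is part of the pseudo-language
above.

```cpp
#include <cassert>

// Hypothetical root class; stands in for "any object" in a
// dynamically typed language.
class Object {
public:
  virtual ~Object() = default;
};

class X : public Object {
public:
  explicit X(int p) : priv(p) {}

  // Accepts any Object, checks at run time that it is really an X
  // (or a class derived from X), and only then touches its private
  // state.  Returns false, leaving *this unchanged, if the check fails.
  bool copyFrom(const Object& anX) {
    const X* other = dynamic_cast<const X*>(&anX);
    if (other == nullptr) return false;
    priv = other->priv;  // direct access; no getPriv() needed
    return true;
  }

  // Hypothetical helper, used only to observe the result without
  // exporting "priv" through a getter.
  bool samePrivAs(const X& other) const { return priv == other.priv; }

private:
  int priv;
};

// An unrelated class: passing one to X::copyFrom fails the check.
class Y : public Object {};
```

Here a.copyFrom(b) succeeds when b is an X and is refused when the
argument is a Y, which is exactly the loss of generality described
above: only X objects are acceptable, but X never has to export priv.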

Grand conclusion: Per-class encapsulation allows the programmer to
make the tradeoff between how well an object encapsulates its internal
mechanisms and how reusable the methods are with respect to the
allowed/supported types of parameter objects.  Some people believe all
tradeoffs should be made by the programming language (fascism), others
believe all tradeoffs should be made by the programmer (anarchism).
Fascists claim that *their* choice of tradeoffs is "The Right" choice.
Anarchists claim that all programmers are wise enough to make the
right choices in all situations.  Clearly the truth lies somewhere
between these two extremes.

Statically typed OO languages don't allow an arbitrary object to be
passed to the copyFrom method, so the loss of generality/reusability
is inherent in the choice of static typing over dynamic typing.
Therefore, with nothing else to lose, statically typed OO languages
generally prefer per-class encapsulation over per-object
encapsulation.