Objects First


Class Design - Reprise

In this chapter, we review the process of designing a class and set out a step-by-step "recipe" for building classes. We also add pre- and post-conditions to method specifications.

Class Design - Step by Step

  1. User's Requirements

    Before you start, you will need to obtain from the potential users of the class a set of rules about the behaviour of objects of the class. Alternatively, if you're designing a large system and the user has only specified the behaviour of some large, complex objects within the system and you have decided to model these objects as aggregates or hierarchies of simpler objects, then you will need to set out a clear statement of the desired behaviour of each small object.

  2. Specification First!

    Don't write any code until you've designed the class!

    Design at this stage means "create a formal software specification".

    Decide on a name for the class, say Thing, and create a specification file Thing.h.

    1. Name

      Name the class: use C's typedef to name a new C type. The name of the type should be the same as the name of the class.

    2. Methods

      Specify the methods.
      1. Start with the constructor. For each method:
        1. Decide what it will return.
        2. Name it.
        3. Decide what its parameters will be.
        4. Specify pre-conditions.
        5. Specify post-conditions.
      2. Specify the remaining methods.
        Use the same steps as for the constructor. As parameters, each method will have at least the object that the method is manipulating.

  3. Brick Wall

    Note there's a big brick wall here! This is to emphasise the importance of

    1. Writing the specification first, then

    2. Checking that the specification is correct. It must match the users' specifications in every detail: nothing that the user required may be missing and there must be nothing in the software specification that violates the users' requirements.

      A good strategy is to take the users' requirements and go mechanically through them, checking off each one as you confirm that the requirement has been met by the software model that you've specified.

      Violations of the users' requirements

      Take particular care here: a simple trap is to mechanically generate methods that project and update every attribute of a class. While the projectors will not violate users' requirements - they only supply information about the state of an object - the update methods may well do so because they change the state of an object. Often the behaviour rules will prohibit direct changes of certain attributes, either because rules about the validity of the changes must be checked, or because when one attribute is changed, another must be changed in a way that is consistent with some behaviour rules. Some concrete examples of this are found later in this section.

    Only when we're satisfied with the specification ..

  4. Implementation Last!

    1. Import
      • library classes, such as stdlib.h, stdio.h, assert.h, then
      • classes of your own on which this class depends.
    2. List the attributes of the class: put them in a C struct.
    3. Import the specification: make the compiler tell you when there is some inconsistency!
    4. Code the methods
      1. Pre-conditions first:
        Each pre-condition is turned into an invocation of an assert function.
      2. Then the code of the method.
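
As a rough illustration of how the four implementation steps fit together, here is a minimal sketch of a Thing.c laid out in the order given above. It is only one possible arrangement, made under two assumptions: the contents of Thing.h are shown inline (marked by a comment) so that the fragment compiles as a single file, and the class is trimmed to two attributes and two methods to keep it short.

```c
/* Thing.c - sketch of the implementation layout */

/* 1. Import library classes first */
#include <stdlib.h>
#include <assert.h>

/* 2. List the attributes of the class in a C struct
      (assumption: trimmed to two attributes for brevity) */
struct thing {
  double size, mass;
  };

/* 3. Import the specification. A real file would have
        #include "Thing.h"
      here; its contents are shown inline so this sketch stands alone. */
typedef struct thing *Thing;

Thing ConsThing( double size, double mass );
/* Pre-condition: (size > 0) && (mass > 0) */

double ThingMass( Thing t );
/* Pre-condition: t is a valid Thing */

/* 4. Code the methods: pre-conditions first, then the method body */
Thing ConsThing( double size, double mass ) {
  Thing t;
  assert( size > 0 );
  assert( mass > 0 );
  t = (Thing)malloc( sizeof(struct thing) );
  if ( t != NULL ) {
    t->size = size;
    t->mass = mass;
    }
  return t;
  }

double ThingMass( Thing t ) {
  assert( t != NULL );
  return t->mass;
  }
```

Because each definition follows its prototype in the same compilation unit, the compiler reports any mismatch between a method's specification and its implementation.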

Further Notes

Specification - Methods

We generally have three groups of methods:
  1. constructor and destructor,
  2. attribute extractors and
  3. update operations.

Constructor

Constructors allocate resources and set initial attributes for an object. The resources are usually memory space but may also include file space, communications channels, access to I/O devices, etc.

Usually the problem specification will tell you what the initial attributes should be. It will contain statements like:

The size and mass of a thing are known when it is created. Size refers to the maximum dimension in any direction; it's the minimum size of a circular opening through which a thing will pass. Things always have positive size and mass. Initially it will be in an undamaged state.
This tells you that the size and mass should be parameters of the constructor. It also tells you that there is some state of a Thing which will take values like undamaged, damaged, ... .

Thus the constructor will have the specification:

Thing ConsThing( double size, double mass );
/*
  Pre-condition: (size > 0) && (mass > 0)
*/
and in the implementation, the initial state will be set to undamaged:
Thing ConsThing( double size, double mass ) {
  Thing t;
  assert( size > 0 );
  assert( mass > 0 );
  t = (Thing)malloc( sizeof(struct thing) );
  if ( t != NULL ) {
    t->state = undamaged;
    t->size = size;
    t->mass = mass;
    ...
    }
  return t;
  }
Other states like lost, destroyed, .. will be described elsewhere in the specification. It's your job as the class designer to read the problem specification and determine all the states that a Thing can be in!
Thus determining the arguments of the constructor is generally a matter of dividing the attributes of objects into two sets:
a. those which are specified when an object is constructed. These become the formal parameters of the constructor.

and

b. those which assume some specified default value. These are set to their default values inside the constructor.
Checkpoint
When the constructor returns, each attribute of the newly created object should have a defined value. The values of the attributes define the state of the object, so if any attributes are not defined at this point, the object's state is not defined.

At every point in a program, you should be able to make clear statements about the state of every object.

Multiple constructors

In our real or virtual world, an object may be able to be constructed in many ways. Thus we must allow a class to have multiple constructors. We might be able to construct an object from another object, so that our new object acquires all its attribute values from its "parents". The simplest example of this type of constructor is a "Copy" method, which creates a new object which is a clone of an existing one:
Thing CloneThing( Thing t );
/* Pre-condition: t is a valid Thing
   Post-condition: returned Thing is a clone of t
*/
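A possible implementation of this specification is sketched below. It assumes the three-attribute struct used elsewhere in this section, with the list of states cut down to three for illustration. Since the struct holds no pointers to other storage, a plain structure assignment is enough to copy every attribute; a class that owns further resources would need to copy those too.

```c
#include <stdlib.h>
#include <assert.h>

/* Assumption: the list of states is abbreviated for this sketch */
typedef enum { undamaged, damaged, lost } ThingState;

struct thing {
  double size, mass;
  ThingState state;
  };

typedef struct thing *Thing;

Thing CloneThing( Thing t ) {
  Thing c;
  assert( t != NULL );               /* Pre-condition: t is a valid Thing */
  c = (Thing)malloc( sizeof(struct thing) );
  if ( c != NULL ) {
    *c = *t;                         /* copy every attribute of the original */
    }
  return c;                          /* NULL if no memory was available */
  }
```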
In software specifications, we might want to specify sets of constructors which differ in which attributes take default values and which must be supplied at construction. For example, our factory's production line occasionally makes damaged Things which we send on to a recovery section, so we add another constructor:
Thing ConsThingDamaged( double size, double mass );
which is the same as ConsThing except that the initial state is set to damaged. Alternatively, we could allow the initial state to be set on construction:
Thing ConsThingWithState( double size, double mass, ThingState s );
Either of these approaches is acceptable: the better solution will depend on the problem and there is the possibility that there are conflicting, but equal weight, arguments in favour of both!
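
The two approaches need not exclude each other. As a sketch only (the arrangement is an assumption, not something the problem specification requires), the general constructor can do all the work while the other two become one-line wrappers, so the pre-conditions and the allocation are written once. The list of states is abbreviated here.

```c
#include <stdlib.h>
#include <assert.h>

/* Assumption: the list of states is abbreviated for this sketch */
typedef enum { undamaged, damaged, lost } ThingState;

struct thing {
  double size, mass;
  ThingState state;
  };

typedef struct thing *Thing;

/* General constructor: the caller chooses the initial state */
Thing ConsThingWithState( double size, double mass, ThingState s ) {
  Thing t;
  assert( size > 0 );
  assert( mass > 0 );
  t = (Thing)malloc( sizeof(struct thing) );
  if ( t != NULL ) {
    t->size = size;
    t->mass = mass;
    t->state = s;
    }
  return t;
  }

/* Default constructor: the initial state is undamaged */
Thing ConsThing( double size, double mass ) {
  return ConsThingWithState( size, mass, undamaged );
  }

/* Constructor for Things that come off the line damaged */
Thing ConsThingDamaged( double size, double mass ) {
  return ConsThingWithState( size, mass, damaged );
  }

ThingState CurrentState( Thing t ) {
  assert( t != NULL );
  return t->state;
  }
```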

Aside
There is a certain amount of "art" in good software design.

There's generally no "best" in software design.

However, there are plenty of "don't"s!

Generally, it's possible to identify practices which are clearly good or bad.

Apply criteria like:

  • Would the reader be able to understand the design
    • quickly,
    • unambiguously?
  • Can I change the software
    • efficiently ie will a change in one place require multiple additional changes?
    • quickly ie is the design clear enough to allow me to find the section that needs change efficiently?
  • Are the software modules easy to read?
    • Are all the names self-explanatory?
    • Is the naming system consistent?
There is an enormous grey area of acceptable practices and it's generally impossible to label one approach "best". If you can meet all the criteria listed, then your design is acceptable, even if someone may be able to later convince you that there's a slightly better way!

Attribute Extractors

We usually need to be able to query an object to determine values of its attributes. So class designs will include a group of methods which simply return values of the object's attributes.
double ThingMass( Thing t );
double ThingSize( Thing t );
ThingState CurrentState( Thing t );
Most of these are simple methods, consisting perhaps of no more than an assertion and a return statement:
double ThingMass( Thing t ) {
  assert( t != NULL );
  return t->mass;
  }
However, remember that these methods are important in hiding details of our class' implementation from its users. Hiding these details allows us to change the internal attributes as needs change without affecting programs that manipulate objects of the class.

For example, suppose that our needs expand and we have to expand the Thing class to include volume and density attributes. Since the mass, volume and density are inter-related, we only need to store two of the three. Suppose we decided to store volume and density: this means that the ThingMass method needs to be changed:

double ThingMass( Thing t ) {
  assert( t != NULL );
  return t->volume * t->density;
  }
but the code that uses or manipulates Things needs no changes! This is an important contribution to reducing the cost of maintaining software and increasing its reliability. Fewer changes mean fewer opportunities for error!
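
To make the point concrete, the sketch below pairs the revised ThingMass with a piece of client code. TotalMass and ConsThingFromVolume are hypothetical names introduced for this illustration only. TotalMass reaches the mass solely through ThingMass, so it compiles and behaves the same whether the struct stores mass directly or stores volume and density.

```c
#include <stdlib.h>
#include <assert.h>

/* Revised attributes: mass is no longer stored */
struct thing {
  double volume, density, size;
  };

typedef struct thing *Thing;

/* Hypothetical constructor for the revised class */
Thing ConsThingFromVolume( double size, double volume, double density ) {
  Thing t;
  assert( size > 0 );
  assert( volume > 0 );
  assert( density > 0 );
  t = (Thing)malloc( sizeof(struct thing) );
  if ( t != NULL ) {
    t->size = size;
    t->volume = volume;
    t->density = density;
    }
  return t;
  }

double ThingMass( Thing t ) {
  assert( t != NULL );
  return t->volume * t->density;
  }

/* Hypothetical client code: it knows nothing about struct thing */
double TotalMass( Thing things[], int n ) {
  double total = 0.0;
  int i;
  assert( things != NULL );
  assert( n >= 0 );
  for ( i = 0; i < n; i++ ) {
    total += ThingMass( things[i] );
    }
  return total;
  }
```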

ThingState?

Note that our example included a type named ThingState. This labels the state of our things as damaged, undamaged, lost, .. and would ideally be implemented as an enumerated type:
typedef enum { undamaged, damaged, lost, .. } ThingState;
But where do we define this type?

Formally, it should be considered a new class and have a pair of specification and implementation files (ThingState.h, ThingState.c) to itself. However, practically, ThingState is never used without Thing, so it seems more reasonable to define this additional type in the specification for Thing, ie add it to Thing.h:

/* Thing.h - Specification for class: Thing */

typedef enum { undamaged, damaged, lost, .. } ThingState;

typedef struct thing *Thing;

Thing ConsThing( .. );

...
This departure from the formal steps for building a class is justified by efficiency and compactness considerations: it doesn't make sense to create two - essentially trivial - files to define the ThingState class when a one line addition to the Thing.h file will neatly solve the problem! (However, a formal approach with ThingState in a separate file could not be considered wrong. Experience will soon teach you when departure from the rules makes sense (usually because it makes it easier to understand a program) and when it will make your program more complex and difficult to work with.)

Update Methods

In this group we include all methods which change the value of an object's attributes in some way. Your application may need a simple set of Set.. methods which change the values of individual attributes:
void SetMass( Thing t, double new_mass );
void SetSize( Thing t, double new_size );
These methods have straight-forward implementations, eg
void SetMass( Thing t, double new_mass ) {
  assert( t != NULL );
  assert( new_mass > 0 );
  t->mass = new_mass;
  }

Enforcing Behavioural Rules

The specification will contain rules about the behaviour of objects in your class. Some of these rules will imply that certain attributes must not be changed directly. To ensure that your objects conform to these behavioural rules, you will deliberately omit methods that update some attributes directly. For example, our Thing class may have a "Conservation of Mass" rule which says that our Things may not change their mass, unless they are involved in a collision with another Thing (see example later). This implies that a SetMass method should not be provided because it would allow a programmer to violate a basic rule for the class. Alternatively, there might be a rule that says Things only change mass when they're heated above a certain temperature (and bits start to crack off). In this case also, the SetMass method is not provided and one called ThermalLoss is provided instead:
double ThermalLoss( Thing t, double temperature );
/* Heat a thing to temperature and calculate the mass lost through thermal
   fracturing of the surface
  Pre-cond: t is a valid Thing
            temperature > 0.0
  Post-cond: Returns mass lost
             (mass lost + ThingMass(t)) = old ThingMass(t)
*/
Observe the post-condition here: this is an example of a really useful post-condition - one that checks that the method has not violated some basic rule - in this case, the physical conservation of mass law.
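
A post-condition of this kind can be checked in the implementation in the same way as a pre-condition. The sketch below is illustrative only: the loss model (a fixed fraction of the mass is lost once a cracking temperature is exceeded) and the constants CRACK_TEMPERATURE and LOSS_FRACTION are assumptions, not part of the problem specification. Because the quantities are doubles, the conservation check allows for a small rounding error instead of demanding exact equality.

```c
#include <stdlib.h>
#include <assert.h>

struct thing {
  double size, mass;
  };

typedef struct thing *Thing;

/* Hypothetical constants for the loss model */
#define CRACK_TEMPERATURE 500.0   /* no mass is lost at or below this */
#define LOSS_FRACTION     0.25    /* fraction of the mass lost above it */

Thing ConsThing( double size, double mass ) {
  Thing t;
  assert( size > 0 );
  assert( mass > 0 );
  t = (Thing)malloc( sizeof(struct thing) );
  if ( t != NULL ) {
    t->size = size;
    t->mass = mass;
    }
  return t;
  }

double ThingMass( Thing t ) {
  assert( t != NULL );
  return t->mass;
  }

double ThermalLoss( Thing t, double temperature ) {
  double old_mass, lost, diff;
  /* Pre-conditions */
  assert( t != NULL );
  assert( temperature > 0.0 );
  old_mass = t->mass;
  lost = ( temperature > CRACK_TEMPERATURE ) ? LOSS_FRACTION * old_mass : 0.0;
  t->mass = old_mass - lost;
  /* Post-condition: (mass lost + new mass) = old mass, within rounding */
  diff = ( lost + t->mass ) - old_mass;
  if ( diff < 0.0 ) diff = -diff;
  assert( diff <= 1e-9 * old_mass );
  return lost;
  }
```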

Complex Update Methods

Your application may also need more complex methods which change more than one attribute:
void HitThing( Thing t, double force );
/* Pre-condition: t is a valid Thing,
                  force >= 0
   Post-condition: if ( force < damage_threshold ) t is unchanged,
                   else {
                       a piece whose size is proportional
                                  to force is broken off
                       (mass and size changed appropriately) and
                       t's state is changed to damaged.
                       }
*/
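One way this specification might be implemented is sketched below. The constants DAMAGE_THRESHOLD, CHIP_SIZE_PER_FORCE and MAX_CHIP_FRACTION are hypothetical, as is the assumption that mass is lost in the same proportion as size; a real problem specification would supply its own rules. The chip is capped so that size and mass stay positive, as the class requires.

```c
#include <stdlib.h>
#include <assert.h>

/* Assumption: the list of states is abbreviated for this sketch */
typedef enum { undamaged, damaged, lost } ThingState;

struct thing {
  double size, mass;
  ThingState state;
  };

typedef struct thing *Thing;

/* Hypothetical constants: not taken from the problem specification */
#define DAMAGE_THRESHOLD    10.0   /* weaker hits leave t unchanged */
#define CHIP_SIZE_PER_FORCE 0.01   /* chip size = this * force */
#define MAX_CHIP_FRACTION   0.5    /* a chip is at most half of t */

Thing ConsThing( double size, double mass ) {
  Thing t;
  assert( size > 0 );
  assert( mass > 0 );
  t = (Thing)malloc( sizeof(struct thing) );
  if ( t != NULL ) {
    t->state = undamaged;
    t->size = size;
    t->mass = mass;
    }
  return t;
  }

void HitThing( Thing t, double force ) {
  double chip_size, fraction;
  /* Pre-conditions */
  assert( t != NULL );
  assert( force >= 0 );
  if ( force < DAMAGE_THRESHOLD ) return;      /* t is unchanged */
  chip_size = CHIP_SIZE_PER_FORCE * force;     /* proportional to force */
  if ( chip_size > MAX_CHIP_FRACTION * t->size ) {
    chip_size = MAX_CHIP_FRACTION * t->size;
    }
  fraction = chip_size / t->size;
  t->mass -= fraction * t->mass;   /* assumption: mass lost in proportion */
  t->size -= chip_size;
  t->state = damaged;
  /* Things always have positive size and mass */
  assert( t->size > 0 );
  assert( t->mass > 0 );
  }
```

Note that all three attributes are updated inside one method, so no caller can change the size without the mass and state being brought into line with it.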

Methods - General

Note that, except for the constructor, a method operates on an object of a class. Thus it must have, as one of its arguments, an object of the class. It's probably a good convention to make this the first argument.
void HitThing( Thing t, double force );

However, we can easily imagine much more complex methods:


int ThingsColliding( Thing a, double vel_a, Thing b, double vel_b,
                     Thing chips[], double chip_vel[] );
/* Things a and b collide and produce an
   array of chips
   Pre-condition:
         (a != NULL) && (b != NULL) &&
                  (vel_a >= 0) && (vel_b >= 0)
   Post-condition:
         if the collision produces any chips,
         this method creates an array of new Things - chips
         and an array of velocities, chip_vel.
         The number of chips produced is returned as the
         function value
*/
In this method, two objects interact with each other and create a set of new objects. This method will update a and b, as well as calling the Thing constructor to create the "chips".

Dependent Classes

Classes often depend on other classes, eg a point in two-dimensional space is composed of two doubles, a rectangle is defined in terms of two points and a map of a building (used by a robot for navigating around it) may consist of arrays of rectangles defining areas which it cannot enter.

Classes can also be defined which are specialisations of other classes, eg we could define a class of geometric Shapes which have attributes such as position, colour, etc. This class can be specialised into circles, rectangles (which are in turn specialised into squares) and so on. Methods which operate on Shapes, eg Colour( s ), Position( s ) also operate on circles, rectangles and squares. In the formal theory of object oriented design, this is known as inheritance.

We will not attempt to treat these dependencies and inheritance here with any rigour. It will suffice to note that in the specification of a class, we will often need to refer to another class. For example, a constructor that constructs objects by reading data from a text file:
Thing ReadThing( FILE *f );
/* Pre-condition: f is open
*/
depends on the stream class which defines FILE. So we must include its specification in the specification for Things:
#include <stdio.h>
....
Thing ReadThing( FILE *f );
/* Pre-condition: f is open
*/
However, if the implementation depends on other classes, then import them into the implementation file.
Specifications should be imported on a "need-to-know" basis.
For example, the calculations for ThingsColliding will certainly need the standard mathematical function library, <math.h>. However, functions which call ThingsColliding probably don't need to do such calculations so shouldn't be burdened unnecessarily with <math.h>.
Spelling Mistakes
The "need-to-know" inclusion principle is a good strategy for catching careless spelling mistakes in function names. An ANSI C compiler will produce a warning if it can't see a prototype for a function. Suppose that you intended to call a function, sign( x ). You accidentally type sin( x ). Because you've unnecessarily added #include <math.h> to your file, the compiler doesn't pick up your error and the program compiles, links and runs to completion. However the output is rubbish and you're now faced with the problem of tracing back through the whole program to find a single letter typing mistake!


Implementation

Attributes

An important design decision in building the implementation is to name and set types for the attributes of an object. The attributes which will be needed can be determined by examining the description of the problem: it will contain information about the properties of objects which need to be modelled. Each distinct property will need to be included as a class attribute. Sometimes it will be obvious: initially our Things had size and mass, so we included attributes with type double (float would have been fine if the problem's precision was low). We also noted the need to classify objects as damaged, etc, so we added an attribute state of the enumerated type ThingState.
struct thing {
  double mass, size;
  ThingState state;
  };
Later, we discovered that we needed volume and density too, so we decided to replace mass with volume and density.
struct thing {
  double volume, density, size;
  ThingState state;
  };
(Alternatively, we could have simply added volume and calculated density as we needed it.)

Again we note that we can extend the capabilities of the class to model real world objects in this way without affecting code which only needed to deal with simpler Things. We leave all the original methods (making changes only where we have no choice) and simply add some new ones to model the new capabilities. Since the majority of maintenance of software is probably the extension and refinement of capabilities, this design strategy, which permits tested systems to be gradually improved or refined in a way which doesn't affect their original properties, contributes significantly to the reliability of software systems.

Pre-conditions

Pre-conditions in the specifications are transferred to the implementation as assert statements. For example:
double ThingMass( Thing t ) {
  assert( t != NULL );
  return t->mass;
  }

int ThingsColliding( Thing a, double vel_a, Thing b, double vel_b,
                     Thing chips[], double chip_vel[] ) {
  ..
  assert(a != NULL);
  assert(b != NULL);
  assert(vel_a >= 0);
  assert(vel_b >= 0);
  ...
Note that every method (other than constructors) should routinely check that it has not been passed a NULL pointer as the handle of the object on which it operates. In addition, all other method parameters should generally be checked for legal or reasonable values. For example, negative velocities are not allowed. This is a clear legal value constraint; another is that velocities must be less than 3 x 10^8 m/s! We might also like to add a reasonableness constraint, such as:
  assert( vel_a < 20.0 );  /* Max reasonable velocity is 20m/s */
The major value of these assertions is that they "catch" errors made in other parts of the program, eg a user typed in an extra 0 when entering a velocity and the programmer responsible for the code reading the input forgot to add code at the input point to request a better value from the user.
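
The check that was forgotten at the input point might look something like the sketch below. ParseVelocity is a hypothetical helper, not part of the Thing class: it accepts a line of text only if it holds a single number in the legal and reasonable range, so that the calling code can ask the user for a better value. The assertions inside the class methods then remain as a safety net for anything that slips through.

```c
#include <stdlib.h>
#include <assert.h>

#define MAX_REASONABLE_VELOCITY 20.0   /* m/s, the limit used above */

/* Hypothetical helper: returns 1 and stores the value in *velocity if
   text holds a legal, reasonable velocity; returns 0 and leaves
   *velocity untouched otherwise */
int ParseVelocity( const char *text, double *velocity ) {
  char *end;
  double v;
  assert( text != NULL );
  assert( velocity != NULL );
  v = strtod( text, &end );
  if ( end == text ) return 0;                    /* no number found */
  if ( *end != '\0' && *end != '\n' ) return 0;   /* trailing junk */
  if ( !( v >= 0.0 && v < MAX_REASONABLE_VELOCITY ) ) return 0;
  *velocity = v;
  return 1;
  }
```

A caller would typically loop, re-prompting the user until ParseVelocity returns 1, and only then pass the value on to a method such as ThingsColliding.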

The extra typing needed to add all the assert() statements may seem like a lot of unnecessary work, but my experience has shown that it is well worth it. Whenever an assertion is raised, it is extremely simple to trace back from the point at which the assertion was raised to the real source of the error. This is much quicker than trying to infer the source of the error from some erroneous output from a program.