Objects First


Choosing the Test Data

Poorly verified programs are bad for your health!
They don't understand when it's convenient to fail!

For the withdraw method on our savings account class, we made two tests: one for the case where the account had sufficient funds and the other for the inadequate funds case. Of course, we could have tested many other cases, but a simple analysis of the function shows that these two tests are adequate and that we don't need to perform any other tests.

Path analysis

Look at the code of the Withdraw method:
double Withdraw( Account a, double amt ) {
  assert( a != NULL );
  assert( amt >= 0.0 );
  if ( amt <= a->balance ) {
    a->balance = a->balance - amt;
    return amt;
    }
  else {
    printf("Insufficient funds\n");
    return 0.0;
    }
  }
We can see that the execution path through the program can take exactly two paths:

A Minimum Testing Criterion

One useful minimum criterion for testing a function is that each branch of the function must be taken at least once in the testing procedure. This means that in branching statements - if and switch - each alternative set of statements: must be chosen by at least one test.

This criterion is obviously necessary for complete testing: if it's not met then there are program statements which have never been executed which could have errors.

Note that you must include branches that process errors also!

The behaviour of a function when it gets erroneous data is just as important as the the behaviour for correct data.

Our bank will certainly want to be sure that a depositor cannot withdraw more than his current balance! The behaviour when erroneous data is presented to a function will (or should!) be precisely stated in the specifications. The programmer will make decisions as to where errors are detected (ie whether they are part of a function's pre-conditions or its post-conditions, but the tester will ensure that erroneous data is included in the tests.

Errors arise from many sources, from human error at input to faults generated by failure of equipment (disc crashes, etc) and disturbances (noise on communication lines). Failure to test the error handling code will generally leave a large amount of untested, yet usually vital, code. In general information processing systems, it is not unusual for more than half the code to be involved with catching and processing errors of some kind or other!

Test Coverage Tools

Automated tools exist that analyse your program and insert extra statements that count the number of times each statement of your program is executed.

These tools may be used to identify statements which are never executed. Tests are added to the test set until the tool reports that every statement is executed at least once. Thus they provide a useful automated confirmation that a minimum set of tests has been devised. In particular, they would verify that all error catching and processing code has been exercised.

The Unix graphical profiler, gprof, is one example of such a tool.

Sometimes these tools are called profilers because they are used in optimisation exercises to determine which sections of a program are exercised most often and are therefore likely to produce the most reward for effort in optimisation.

It is unfortunate that performance seems to be more highly regarded than correctness. It is completely obvious that a program that produces the wrong answer quickly is of less use than one that produces a correct answer slowly!

Multiple Paths

Unfortunately real programs generally have complex branching structures which result in large numbers of paths through any section of code.
int f( int a ) {
  if( a < b ) {
    a = procA( a );
    }
  else {
    a = procB( a, b );
    }
  if( a > c ) {
    procC( a, c );
    }
  else {
    procD( a, b );
    }
  return a;
  }
The red and green arrows show two paths which test all the statements in this simple piece of code.

But are they enough?

No! This simple fragment of code has 4 paths through it! Strictly, each path should be tested since the value of a may be changed in the first if .. else statement and thus affect the path chosen through the second if .. else statement.

It is easy to see that for even a relatively simple function, just a few branches soon lead to a large number of possible paths!

If the program has k branches (if or switch) each with ni branches (ni >= 2), then the total number of paths may be
n1 n2 .. nk >= 2 k.

Thus to fully test f(a), we must take at least 4 values of a and call f(a) with them. (Note the italicised at least - because we can't see the code for procA, procB, .. which may require further values of a.) We choose these values to ensure that the four paths through f(a) are all checked.

A key point here is that we choose the test values of a carefully to ensure that all paths through the function are exercised. We do not choose random values of a: it would be easy to choose 4 values which just took the program along the red path, leaving 3 timebombs for us.

Equivalence Classes

Let's look at another simple example: the function which converts lower case characters to upper case ones - toupper. An implementation of toupper might look like this:
char toupper( char c ) {
  if ( (c>'a') || (c<'z') ) {
    return c - 'a' + 'A';
  else
    return c;
  }
The possible inputs to this function as the 256 characters which can be coded in 8 bits: if we lay them out in order according to the ASCII coding scheme they look something like this:

By looking at this diagram, we can see that the input divides itself into 3 regions: characters less than 'a', characters in the range 'a' - 'z' and characters greater than 'z'. We would get the same analysis by looking at the program - there is one if .. else .. branch and the condition that determines entry to the TRUE branch (if (cond) .. branch) has two parts to it for a total of 3 cases - two conditions which will generate TRUE and one (all the rest) that generates FALSE.

This suggests that we need to perform 3 tests to verify this function: choosing a value in the blue region, one in the green region and a further one in the magenta region. In each region, it doesn't matter which character we choose as a representative of its region - all characters in the region are equivalent in the sense that they're treated identically by the function toupper. Formally, we say that we've divided the input data into equivalence classes - sets of data which are equivalent as far as the function we're testing is concerned.

Thus, we choose 3 characters, say '0', 'f' and '|', to test our function. We make a little test program:

int test_toupper() {
  int i;
  struct { char test, result; } tests[] =
                { {'0','0'}, {'f','F'}, {'|','|'} };
#define N_tests	sizeof(tests)
  for(i=0;i<N_tests;i++) {
    if( toupper(tests[i].test) != tests[i].result ) return FALSE;
    }
  return TRUE;
  }
We run this program and discover that all characters are converted, ie the '0' and '|' tests fail. Examination of toupper shows that we wrote || instead of &&, so our 3 test cases were sufficient to find the error. We correct it and run the test program again. It now processes our 3 test cases correctly, so we are satisfied that it will process all others correctly.

Or are we?

Go back and look at the code carefully. A more careful examination suggests that this would be better:
char toupper( char c ) {
  if ( (c>='a') && (c<='z') ) {
    return c - 'a' + 'A';
  else
    return c;
  }

What happened?

We made a very common programming mistake: typing > instead of >=. Something which is very easy to overlook, isn't it?

To rectify this, experience teaches us that we should also include boundary values in our test data. Luckily, we've structured our test program so that some simple changes will fix the test routine and catch the error in the original code:

int test_toupper() {
  int i;
  struct { char test, result; } tests[] =
                { {'0','0'}, {'a','A'}, {'f','F'}, {'z','Z'},
                {'|','|'} };
#define N_tests	sizeof(tests)
  for(i=0;i<N_tests;i++) {
    if( toupper(tests[i].test) != tests[i].result ) return FALSE;
    }
  return TRUE;
  }
This will catch the original error. But what if we had originally written:
char toupper( char c ) {
  if ( (c>=''') && (c>='{') ) {
    return c - 'a' + 'A';
  else
    return c;
  }
where we've just used the values outside the lower case letters as the test points? Having been caught once with this problem, our really experienced software engineer now includes both boundary values in the test cases:
  struct { char test, result; } tests[] =
                { {'0','0'}, {'\'','\''}, {'a','A'}, {'f','F'},
                {'z','Z'}, {'{','{'},
                {'|','|'} };

General procedure

  1. For each input,
  2. Write a test program which applies this set of values to the function under test.
If the function you are testing has n inputs, then the set of test values is the cartesian product of the sets of test values chosen for each individual input. While this can readily turn out to be a very large set, it is trivially handled by a set of nested loops in C (or any other high level language).

Assume we have a function, f(a,b,c) of three values and that we have identified N_a_test_values, a[i] for a, N_b_test_values, b[i] for b, etc and typed up a (possibly quite large) set of expected values, then the following code will mechanically perform

N_a_test_values* N_b_test_values* N_c_test_values

tests for us each time we change f(a,b,c) and need to verify it again:
  ok = TRUE;
  for(i=0;i<N_a_test_values;i++) {
    for(j=0;j<N_b_test_values;j++) {
      for(k=0;k<N_c_test_values;k++) {
        if( f(a[i],b[j],c[k]) != expected_value[i][j][k] ) {
          ok = FALSE;
          break;
          }
        }
      }
    }

Key terms

Test Coverage Tools
Tools which count the number of times of a program has executed each statement of a program. Primarily these tools can be used to check that every statement has been executed at least once.
Profilers
Profilers count the number of times each statement (or region of a program, such as a function) has been executed. Although these tools are primarily designed to assist the optimisation of programs (by identifying frequently used parts of a program which will benefit most from optimisation), they can be used in the same way as coverage tools to ensure that all statements in a program have been executed.
Path Analysis
producing a computer program rapidly, without thought and
Equivalence Classes
Equivalence Classes

Continue on to Special Values Back to the Table of Contents
© John Morris, 1998