Objects First
Figure: Stages in compiling x.c (filenames are those used by Unix compilers)
| Module | Purpose |
|---|---|
| Pre-processor | Pre-process text for input to the compiler |
| Compiler | Compile pre-processed text into intermediate code |
| Code generator | Generate assembler code for a target machine |
| Assembler | Compile assembler code into object modules |
| Linker | Link object modules into executable code |

Many variations of this scheme are possible and will be found in existing compilers.
In fact, a lot of C's early popularity may have derived from this structure, as the first two programs were written in C and were common to all machines. Generating a compiler for a new architecture was simply a matter of writing new code generators, assemblers and linkers. This sped up the process considerably and meant that C compilers quickly became available for all new architectures.
Unix compilers (as well as some other commonly used compilers, eg the Free Software Foundation's GNU compiler, gcc) still use separate programs for the various phases. However, other modern GUI-based development systems bundle the various compiler modules differently - or into one large monolithic program - thereby gaining some small speed increases and reducing programmers' caffeine intakes by not providing as many excuses for coffee breaks.
Whatever their actual structure, all C compilers provide the first two phases in a logical sense, ie the result of your compilation is as if your program has been passed first through the pre-processor and then through the compiler itself.
There are two variants of #include:
#include <stdio.h>
#include "abc.h"
The first, in which the file name appears in angle brackets,
< > is generally used for "standard" include
files, eg specification files for ANSI library
functions, such as
stdio.h, stdlib.h, time.h, etc.
It looks for the file in a standard list of directories.
This list is set for your compiler and may vary from
system to system;
on Unix systems, it will usually include
/usr/include.
On other systems, the list of directories to be searched
will usually be set when the compiler is installed.
However, it's generally possible to edit it, so that, for example,
you can add your own libraries of functions or classes to those
required by the ANSI standard.
A little bit of searching through the various option menus will
usually uncover the list for your system.
In the second form, in which the filename appears in quotes,
" ", the local directory is searched for the file first;
if it is not found there, the standard list is searched as well.
You would use this form for including specification files for
classes that you have written for this program.
The file name is usually interpreted according to rules for the
host operating system, so that
#include "/u/userx/classZ.h"
will be acceptable to a Unix operating system and
#include "C:classZ.h"
will be acceptable to lesser systems.
#define EPSILON 1.0e-5
#define MAX_COUNT 500
#define PRINT_LINE printf("----------");
#define LONG_SUBSTITUTION if(a>b){ printf("a>b"); } \
else { printf("a<=b"); }
Note the use of the back-slash (\) to extend the text string that
will be substituted over multiple lines.
As the examples show, #define can be used to define
arbitrary program fragments in addition to its normal use
for defining constants.
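As a small illustration of the constant-defining use, the sketch below uses EPSILON and MAX_COUNT from the examples above to bound a loop. The function name steps_to_epsilon is hypothetical and is not part of the original examples.

```c
#define EPSILON 1.0e-5
#define MAX_COUNT 500

/* Hypothetical helper: halve x until it drops below EPSILON,
   giving up after MAX_COUNT steps. Returns the number of steps taken.
   After pre-processing, the compiler sees the loop condition as
   (x >= 1.0e-5 && count < 500). */
int steps_to_epsilon(double x)
{
    int count = 0;
    while (x >= EPSILON && count < MAX_COUNT) {
        x = x / 2.0;
        count++;
    }
    return count;
}
```

Changing the tolerance or the iteration limit now means editing one #define line rather than hunting for every literal 1.0e-5 or 500 in the program.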
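#define PRINT(x) printf(#x " = %d",x)
#define SQR(x) (x*x)
#define CUBE(x) (x*x*x)
#define PER_CENT(x,y) (x*100.0/y)
The pre-processor takes the actual arguments to the macro and substitutes
them for the formals as it expands the macro.
In an ANSI C pre-processor, #x turns the actual argument into a string literal.
It must stand outside the quotes, because formals inside a string literal are not replaced;
the compiler then joins the adjacent string literals.
Thus:
| Macro call | becomes |
|---|---|
| SQR(z) | (z*z) |
| PRINT(max) | printf("max" " = %d",max) |
| PER_CENT(p,q) | (p*100.0/q) |
| PER_CENT(a+b,c+d) | (a+b*100.0/c+d) |
| SQR(a/b) | (a/b*a/b) |
The last two expansions are not what was intended:
operator precedence means that only b is multiplied by 100.0 and divided by c,
and a/b*a/b is evaluated as ((a/b)*a)/b.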
In order to obtain the desired results reliably,
macro arguments which form part of expressions should be
enclosed in parentheses.
#define PRINT(x) printf(#x " = %d",(x))
#define SQR(x) ((x)*(x))
#define CUBE(x) ((x)*(x)*(x))
#define PER_CENT(x,y) ((x)*100.0/(y))
Now the expansions produce:
| Macro call | becomes |
|---|---|
| SQR(z) | ((z)*(z)) |
| PRINT(max) | printf("max" " = %d",(max)) |
| PER_CENT(p,q) | ((p)*100.0/(q)) |
| PER_CENT(a+b,c+d) | ((a+b)*100.0/(c+d)) |
| SQR(a/b) | ((a/b)*(a/b)) |
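To see the difference in actual values, the sketch below compiles both forms side by side. The _RAW names and the four wrapper functions are hypothetical, introduced only so that the unparenthesised and parenthesised versions can coexist in one file.

```c
/* Unparenthesised forms, renamed with a hypothetical _RAW suffix */
#define SQR_RAW(x) (x*x)
#define PER_CENT_RAW(x,y) (x*100.0/y)

/* Parenthesised forms, as recommended in the text */
#define SQR(x) ((x)*(x))
#define PER_CENT(x,y) ((x)*100.0/(y))

/* Expands to (a+b*a+b): only the middle product is formed */
int sqr_raw_of_sum(int a, int b) { return SQR_RAW(a+b); }

/* Expands to ((a+b)*(a+b)) */
int sqr_of_sum(int a, int b) { return SQR(a+b); }

/* Expands to (a+b*100.0/c+d) */
double per_cent_raw(int a, int b, int c, int d) { return PER_CENT_RAW(a+b, c+d); }

/* Expands to ((a+b)*100.0/(c+d)) */
double per_cent(int a, int b, int c, int d) { return PER_CENT(a+b, c+d); }
```

With a = 1 and b = 2, sqr_raw_of_sum returns 1+2*1+2 = 5, whereas sqr_of_sum returns the intended 9.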
Symbols within macros are also expanded by the pre-processor,
so if I want to assign symbolic names to a set of sequential
integers, eg the names for states of a state machine,
then I can write:
#define RESET 0
#define IDLE (RESET+1)
#define MEM_WAIT (IDLE+1)
#define MEM_READ (MEM_WAIT+1)
....
The expansion produced by the pre-processor for
MEM_READ is
(((0+1)+1)+1)
which gives us the desired value of 3.
The reason for doing this, rather than explicitly assigning 0, 1, 2, ..
to the various states, is that it
makes maintenance of the program easier.
To insert a new state into the middle of the sequence,
the new line is simply added and the one following it changed.
If I had explicitly numbered them, I would have had to
change all the following lines!
Of course, I could also have used an
enum and saved myself all the bother,
but, unfortunately, there are occasions when the actual
values need to be defined in some sequence!
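The maintenance argument can be sketched as follows. ADDR_SETUP is a hypothetical new state inserted after IDLE, and next_state is a hypothetical helper; only the MEM_WAIT line had to change to make room for the new state.

```c
#define RESET 0
#define IDLE (RESET+1)
#define ADDR_SETUP (IDLE+1)      /* hypothetical new state: one line added */
#define MEM_WAIT (ADDR_SETUP+1)  /* the only existing line that changed */
#define MEM_READ (MEM_WAIT+1)    /* untouched, but now expands to a larger value */

/* Hypothetical helper: step through the states in sequence,
   wrapping back to RESET after the last one. The macros expand to
   integer constant expressions, so they are legal case labels. */
int next_state(int state)
{
    switch (state) {
    case RESET:      return IDLE;
    case IDLE:       return ADDR_SETUP;
    case ADDR_SETUP: return MEM_WAIT;
    case MEM_WAIT:   return MEM_READ;
    default:         return RESET;
    }
}
```

MEM_READ now expands to ((((0+1)+1)+1)+1), ie 4, without its own definition having been edited.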
#undef SYMBOL_X
#define SYMBOL_X NEW_VALUE_FOR_X
If the #undef is omitted, a C compiler will complain that SYMBOL_X is being redefined.
#ifdef TRACE_MODE
printf("x = %d, y = %d, z = %d, q = %d\n", x, y, z, q );
#endif
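A minimal sketch of how such a trace statement sits inside a function is shown below. The function scale is hypothetical. The symbol is defined in the file here, but it could equally be supplied on the command line, since most Unix compilers accept an option of the form -DTRACE_MODE.

```c
#include <stdio.h>

/* Delete this line (or comment it out) to compile the trace output away */
#define TRACE_MODE

/* Hypothetical function: multiplies x by factor, tracing when requested */
int scale(int x, int factor)
{
    int y = x * factor;
#ifdef TRACE_MODE
    /* This line only reaches the compiler when TRACE_MODE is defined */
    printf("scale: x = %d, factor = %d, y = %d\n", x, factor, y);
#endif
    return y;
}
```

When TRACE_MODE is not defined, the pre-processor removes the printf before the compiler sees it, so the trace costs nothing in the final program.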
Another common use is to prepare programs for a number of
target architectures:
#ifdef ALPHA
typedef long int64;
typedef short int32;
typedef struct { char msb, lsb; } int16;
#else
#ifdef SUN
typedef long long int64;
typedef int int32;
typedef short int int16;
#else
typedef struct {int msw, lsw; } int64;
typedef int int32;
typedef short int int16;
#endif /* SUN */
#endif /* ALPHA */
which shows a possible way of writing portable code
to deal with the differing interpretations of
int
on a variety of machines.
When the word length is critical,
int64, int32 or int16
are used rather than
long, int or short.
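Since the typedefs depend on the right branch being selected, it may be worth checking them. The sketch below assumes that neither ALPHA nor SUN is defined, so that the final #else branch applies; the function word_lengths_ok is hypothetical.

```c
#include <limits.h>

/* Assumption: neither ALPHA nor SUN is defined, so these are the
   typedefs from the final #else branch above */
typedef int int32;
typedef short int int16;

/* Hypothetical check: returns 1 when each typedef really has the
   number of bits its name promises on this machine, 0 otherwise */
int word_lengths_ok(void)
{
    return sizeof(int32) * CHAR_BIT == 32
        && sizeof(int16) * CHAR_BIT == 16;
}
```

A program could call word_lengths_ok() once at start-up and refuse to run if it returns 0, which catches a wrongly chosen branch early.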
Note that #ifdef, #ifndef and #else can be nested if needed. Don't forget to include the right number of terminating #endif's - one for every #ifdef or #ifndef.
#ifndef X can be read "if X is not defined .." and can be followed by #else if needed.
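One common use of #ifndef with #else is to supply a default value for a symbol that may or may not have been defined already. The sketch below reuses MAX_COUNT from earlier; MAX_COUNT_SOURCE and the two functions are hypothetical names added for illustration.

```c
/* If MAX_COUNT has not been defined already (eg on the compiler
   command line), fall back to a default value */
#ifndef MAX_COUNT
#define MAX_COUNT 500
#define MAX_COUNT_SOURCE "default"
#else
#define MAX_COUNT_SOURCE "defined elsewhere"
#endif

/* Hypothetical accessors so the chosen values can be inspected */
int max_count(void) { return MAX_COUNT; }
const char *max_count_source(void) { return MAX_COUNT_SOURCE; }
```

Compiled as it stands, the #ifndef branch is taken, so max_count() returns 500 and max_count_source() returns "default".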