This chapter will introduce enough
C++ syntax and program construction concepts to allow you to write
and run some simple object-oriented
programs. In the subsequent chapter we will cover the basic syntax of C and C++
in detail.
By reading this chapter first, you’ll get the basic flavor of what it is like to
program with objects in C++, and you’ll also discover some of the reasons for
the enthusiasm surrounding this language. This should be enough to carry you
through Chapter 3, which can be a bit exhausting since it contains most of the
details of the C language.
The user-defined data type, or class, is what distinguishes C++ from traditional
procedural languages. A class is a new data type that you or someone else
creates to solve a particular kind of problem. Once a class is created, anyone
can use it without knowing the specifics of how it works, or even how classes
are built. This chapter treats classes as if they are just another built-in data
type available for use in programs.
Classes that someone else has created are typically packaged into a library.
This chapter uses several of the class libraries that come with all C++
implementations. An especially important standard library is iostreams, which
(among other things) allows you to read from files and the keyboard, and to
write to files and the display. You’ll also see the very handy string class,
and the vector container from the Standard C++ Library. By the end of the
chapter, you’ll see how easy it is to use a pre-defined library of classes.
All computer languages are translated
from something that tends to be easy for a human to understand (source
code) into something that is executed on a computer (machine
instructions). Traditionally, translators fall into two classes: interpreters
and compilers.
An interpreter translates source code
into activities (which may comprise groups of machine instructions) and
immediately executes those activities. BASIC, for example, has been a popular
interpreted language. Traditional BASIC interpreters translate and execute one
line at a time, and then forget that the line has been translated. This makes
them slow, since they must re-translate any repeated code. BASIC has also been
compiled, for speed. More modern interpreters, such as those for the Python
language, translate the entire program into an intermediate language that is
then executed by a much faster interpreter[25].
Interpreters have many advantages. The
transition from writing code to executing code is almost immediate, and the
source code is always available so the interpreter can be much more specific
when an error occurs. The benefits often cited for interpreters are ease of
interaction and rapid development (but not necessarily execution) of
programs.
Interpreted languages often have severe
limitations when building large projects (Python seems to be an exception to
this). The interpreter (or a reduced version) must always be in memory to
execute the code, and even the fastest interpreter may introduce unacceptable
speed restrictions. Most interpreters require that the complete source code be
brought into the interpreter all at once. Not only does this introduce a space
limitation, it can also cause more difficult bugs if the language doesn’t
provide facilities to localize the effect of different pieces of code.
A compiler translates source code
directly into assembly language or machine instructions. The eventual end
product is a file or files containing machine code. This is an involved process
and usually takes several steps. The transition from writing code to executing
code is significantly longer with a compiler.
Depending on the acumen of the compiler writer, programs generated by a compiler
tend to require much less space to run, and they run much more quickly. Although
size and speed are probably the most often cited reasons for using a compiler,
in many situations they aren’t
the most important reasons. Some languages (such as C) are designed to allow
pieces of a program to be compiled independently. These pieces are eventually
combined into a final executable program by a tool called the
linker. This process is called separate
compilation.
Separate compilation has many benefits. A program that, taken all at once, would
exceed the limits of the compiler or the compiling environment can be compiled
in pieces. Programs can be built and tested one piece at a time. Once a piece is
working, it can be saved and treated as a building block. Collections of tested
and working pieces can be combined into libraries for use by other programmers.
As each piece is created, the complexity of the other pieces is hidden. All these
features support the creation of large
programs[26].
Compiler debugging
features have improved significantly over time. Early compilers only generated
machine code, and the programmer inserted print statements to see what was going
on. This is not always effective. Modern compilers can insert information about
the source code into the executable program. This information is used by
powerful source-level debuggers to show exactly
what is happening in a program by tracing its progress through the source
code.
Some compilers tackle the
compilation-speed problem by performing in-memory
compilation. Most compilers work with files, reading and writing them in each
step of the compilation process. In-memory compilers keep the compiler program
in RAM. For small programs, this can seem as responsive as an interpreter.
To program in C and C++, you need to understand the steps and tools in the
compilation process. Some languages (C and C++, in particular) start compilation
by running a preprocessor on the source code. The preprocessor is a simple
program that replaces patterns in the source code with other patterns the
programmer has defined (using preprocessor directives). Preprocessor directives
are used to save typing and to increase the readability of the code. (Later in
the book, you’ll learn how the design of C++ is meant to discourage much of the
use of the preprocessor, since it can cause subtle bugs.) The pre-processed code is
often written to an intermediate file.
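As a rough illustration of the kind of subtle bug that pattern replacement can cause, here is a minimal sketch. The macro DOUBLE_IT and the function names are hypothetical, invented only for this example. Because the preprocessor substitutes text rather than values, the macro version silently changes the arithmetic:

```cpp
// Hypothetical macro for illustration: the preprocessor pastes the text
// "x + x" wherever DOUBLE_IT(x) appears, before the compiler sees the code.
#define DOUBLE_IT(x) x + x

// An ordinary function with the same intent; its argument is a value.
int doubleIt(int x) { return x + x; }

// After preprocessing this reads: return 3 + 3 * 2;
// The multiplication binds more tightly than the addition.
int viaMacro() { return DOUBLE_IT(3) * 2; }

// Here doubleIt(3) is evaluated first, and the result is multiplied by 2.
int viaFunction() { return doubleIt(3) * 2; }
```

Calling viaMacro( ) yields 9, while viaFunction( ) yields the intended 12. Wrapping the macro body in parentheses would cure this particular case, but the function needs no such care.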
Compilers usually do their work in two
passes. The first pass parses the pre-processed
code. The compiler breaks the source code into small units and organizes it into
a structure called a tree. In the expression “A + B” the elements ‘A’, ‘+’,
and ‘B’ are leaves on the parse tree.
A global
optimizer is sometimes used between the first and
second passes to produce smaller, faster code.
In the second pass, the code generator walks through the parse tree and
generates either assembly language code or machine code for the nodes of the
tree. If the code generator creates assembly code, the assembler must then be
run. The end
result in both cases is an object module (a file that
typically has an extension of .o or .obj). A peephole
optimizer is sometimes used in the second pass to
look for pieces of code containing redundant assembly-language
statements.
The use of the word
“object” to describe chunks of machine code
is an unfortunate artifact. The word came into use before object-oriented
programming was in general use. “Object” is used in the same sense
as “goal” when discussing compilation, while in object-oriented
programming it means “a thing with boundaries.”
The linker
combines a list of object modules into an executable program that can be loaded
and run by the operating system. When a function in one object module makes a
reference to a function or variable in another object module, the linker
resolves these references; it makes sure that all the external functions and
data you claimed existed during compilation do exist. The
linker also adds a special object module to perform start-up
activities.
The linker can search through special
files called libraries in order to resolve all its references. A
library contains a collection of object modules in a
single file. A library is created and maintained by a program called a
librarian.
The compiler performs type
checking during the first pass. Type checking tests
for the proper use of arguments in functions and prevents many kinds of
programming errors. Since type checking occurs during compilation instead of
when the program is running, it is called static type checking.
Some object-oriented languages (notably
Java) perform some type checking at runtime (dynamic
type checking). If combined with static type checking, dynamic type checking is
more powerful than static type checking alone. However, it also adds overhead to
program execution.
C++ uses static type checking because the
language cannot assume any particular runtime support for bad operations. Static
type checking notifies the programmer about misuses of types during compilation
and thus maximizes execution speed. As you learn C++, you will see that most of
the language design decisions favor the same kind of high-speed,
production-oriented programming the C language is famous for.
You can disable static type checking in
C++. You can also do your own dynamic type checking – you just need to
write the code.
Separate compilation is particularly important when building large projects. In
C and C++, a program can be created in small, manageable, independently tested
pieces. The most fundamental tool for breaking a program up into pieces is the
ability to create named subroutines or subprograms. In C and C++, a subprogram
is called a function, and functions are the pieces of code that can be placed in
different files, enabling separate compilation. Put another way, the function is
the atomic unit of code, since you cannot have part of a function in one file
and another part in a different file; the entire
function must be placed in a single file (although files can and do contain more
than one function).
When you call a function, you typically pass it some arguments, which are values
you’d like the function to work with during its execution. When the function is
finished, you typically get back a return value, a value that the function hands
back to you as a result. It’s also possible
to write functions that take no arguments and return no
values.
To create a program with multiple files, functions in one file must access
functions and data in other files. When compiling a file, the C or C++ compiler
must know about the functions and data in the other files, in particular their
names and proper usage. The compiler ensures that functions and data are used
correctly. This process of “telling the compiler” the names of external
functions and data and what they should look like is called declaration. Once
you declare a function or variable, the compiler knows how to check to make sure
it is used properly.
It’s important to understand the
difference between declarations and
definitions because these terms will be used
precisely throughout the book. Essentially all C and C++ programs require
declarations. Before you can write your first program, you need to understand
the proper way to write a declaration.
A declaration introduces a name – an identifier – to the compiler. It tells the
compiler “This function or this variable exists somewhere, and here is what it
should look like.” A definition, on the other hand, says: “Make this variable
here” or “Make this function here.” It allocates storage for the name. This
meaning works whether you’re talking about a variable or a function; in either
case, at the point of definition the compiler allocates storage. For a variable,
the compiler determines how big that variable is and causes space to be
generated in memory to hold the data for that variable. For a function, the
compiler generates code, which ends up occupying storage in memory.
You can declare a variable or a function in many different places, but there
must be only one definition in C and C++ (this is sometimes called the ODR:
one-definition rule). When the linker is uniting all the object modules, it will
usually complain if it finds more than one definition for the same function or
variable.
A definition can also be a declaration.
If the compiler hasn’t seen the name x before and you define int x;, the
compiler sees the name as a declaration and allocates storage for it all at once.
A function declaration in C and C++ gives the function name, the argument types
passed to the function, and the return value of the function. For example, here
is a declaration for a function called func1( ) that takes two integer arguments
(integers are denoted in C/C++ with the keyword int) and returns an integer:
int func1(int, int);
The first keyword you see is the return value all by itself: int. The arguments
are enclosed in parentheses after the function name in the order they are used.
The semicolon indicates the end of a statement; in this case, it tells the
compiler “that’s all – there is no function definition here!”
C and C++ declarations attempt to mimic
the form of the item’s use. For example, if a is another integer, the above
function might be used this way:
a = func1(2, 3);
Since func1( ) returns an integer, the C or C++ compiler will check the use of
func1( ) to make sure that a can accept the return value and that the arguments
are appropriate.
Arguments in
function declarations may have names. The compiler ignores the names but they
can be helpful as mnemonic devices for the user. For example, we can declare
func1( ) in a different fashion that has the same meaning:
int func1(int length, int width);
There is a significant difference between
C and C++ for functions with empty argument lists. In C, the declaration:
int func2();
means “a function with any number and type of argument.” This prevents
type-checking, so in C++ it means “a function with no arguments.”
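A small sketch of what the C++ meaning implies in practice. The body given to func2( ) and the helper callFunc2( ) are assumptions made up for illustration. The commented-out call is the kind of use a C++ compiler rejects, because the empty list is taken literally:

```cpp
// In C++ the empty argument list means exactly "no arguments".
int func2();

// Hypothetical caller, for illustration only.
int callFunc2() {
  // func2(1, 2);  // Rejected by a C++ compiler: func2() takes no arguments
  return func2();  // Matches the declaration, so type checking succeeds
}

// Assumed body; the text only shows the declaration.
int func2() { return 42; }
```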
Function definitions look like function
declarations except that they have bodies. A body is a
collection of statements enclosed in braces. Braces denote the beginning and
ending of a block of code. To give func1( ) a definition that is an empty body
(a body containing no code), write:
int func1(int length, int width) { }
Notice that in the function definition, the braces replace the semicolon. Since
braces surround a statement or group of statements, you don’t need a semicolon.
Notice also that the arguments in the function definition must have names if you
want to use the arguments in the function body (since they are never used here,
they are optional).
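To see a declaration, a use, and a definition of func1( ) side by side, consider the following sketch. The body (multiplying the two arguments) and the helper useFunc1( ) are assumptions chosen only for illustration; the text itself leaves the body of func1( ) empty.

```cpp
// Declaration: introduces the name and the argument and return types.
int func1(int length, int width);

// Hypothetical caller: the declaration above is all the compiler needs
// in order to check this call.
int useFunc1() {
  int a;
  a = func1(2, 3);
  return a;
}

// Definition: the braces replace the semicolon, and the arguments are
// named because the body uses them. The body is an assumed example.
int func1(int length, int width) {
  return length * width;
}
```

Note that useFunc1( ) appears before the definition of func1( ); the declaration alone is enough for the compiler to check the call.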
The meaning attributed to the phrase “variable declaration” has historically
been confusing and contradictory, and it’s important that you understand the
correct definition so you can read code properly. A variable declaration tells
the compiler what a variable looks like. It says, “I know you haven’t seen this
name before, but I promise it exists someplace, and it’s a variable of X type.”
In a function declaration, you give a type (the return value), the function
name, the argument list, and a semicolon. That’s enough for the compiler to
figure out that it’s a declaration and what the function should look like. By
inference, a variable declaration might be a type followed by a name. For example:
int a;
could declare the variable a as an integer, using the logic above. Here’s the
conflict: there is enough information in the code above for the compiler to
create space for an integer called a, and that’s what happens. To resolve this
dilemma, a keyword was necessary for C and C++ to say “This is only a
declaration; it’s defined elsewhere.” The keyword is extern. It can mean the
definition is external to the file, or that the definition occurs later in the
file.
Declaring a variable without defining it means using the extern keyword before a
description of the variable, like this:
extern int a;
extern can also apply to function
declarations. For func1( ), it looks like this:
extern int func1(int length, int width);
This statement is equivalent to the
previous func1( ) declarations. Since there is no function body, the compiler
must treat it as a function declaration rather than a function definition. The
extern keyword is thus superfluous and optional for function declarations. It is
probably unfortunate that the designers of C did not require the use of extern
for function declarations; it would have been more consistent and less confusing
(but would have required more typing, which probably explains the decision).
Here are some more examples of
declarations:
//: C02:Declare.cpp
// Declaration & definition examples
extern int i; // Declaration without definition
extern float f(float); // Function declaration
float b; // Declaration & definition
float f(float a) { // Definition
  return a + 1.0;
}
int i; // Definition
int h(int x) { // Declaration & definition
  return x + 1;
}
int main() {
  b = 1.0;
  i = 2;
  f(b);
  h(i);
} ///:~
In the function declarations, the argument identifiers are optional. In the
definitions, they are required (the identifiers are required only in C, not C++).
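A brief sketch of the C++ side of that rule, using a hypothetical function scale( ) invented for this example. The second argument has no identifier in either the declaration or the definition, which C++ accepts as long as the body never needs to refer to it:

```cpp
// Declaration: identifiers are optional, so the second one is omitted.
int scale(int factor, int);

// Definition in C++: the second argument is still unnamed, which means
// the body has no way to refer to it. Callers must still supply it.
int scale(int factor, int) {
  return factor * 10;
}
```

A call such as scale(2, 99) must still pass both arguments; the second value is simply ignored.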
Most libraries contain significant
numbers of functions and variables. To save work and ensure consistency when
making the external declarations for these items, C and C++ use a device called
the header file. A header file is a file containing the external declarations
for a library; it conventionally has a file name extension of ‘h’, such as
headerfile.h. (You may also see some older code using different extensions, such
as .hxx or .hpp, but this is becoming rare.)
The programmer who creates the library
provides the header file. To declare the functions and external variables in the
library, the user simply includes the header file. To include a header file, use
the #include preprocessor directive. This tells the preprocessor to open the
named header file and insert its contents where the #include statement appears.
A #include may
name a file in two ways: in angle brackets (< >) or in double
quotes.
File names in angle brackets, such as:
#include <header>
cause the preprocessor to search for the
file in a way that is particular to your implementation, but typically there’s
some kind of “include search path” that you specify in your environment or on
the compiler command line. The mechanism for setting the search path varies
between machines, operating systems, and C++ implementations, and may require
some investigation on your part.
File names in double quotes, such as:
#include "local.h"
tell the preprocessor to search for the
file in (according to the specification) an “implementation-defined
way.” What this typically means is to search for the file relative to the
current directory. If the file is not found, then the include directive is
reprocessed as if it had angle brackets instead of quotes.
To include the iostream header file, you write:
#include <iostream>
The preprocessor will find the iostream
header file (often in a subdirectory called “include”) and insert
it.
As C++ evolved, different compiler vendors chose different extensions for file
names. In addition, various operating systems have different restrictions on
file names, in particular on name length. These issues caused source code
portability problems. To smooth over these rough edges, the standard uses a
format that allows file names longer than the notorious eight characters and
eliminates the extension. For example, instead of the old style of including
iostream.h, which looks like this:
#include <iostream.h>
you can now write:
#include <iostream>
The translator can implement the include
statements in a way that suits the needs of that particular compiler and
operating system, if necessary truncating the name and adding an extension. Of
course, you can also copy the headers given you by your compiler vendor to ones
without extensions if you want to use this style before a vendor has provided
support for it.
The libraries that have been inherited
from C are still available with the traditional ‘.h’ extension. However, you can
also use them with the more modern C++ include style by prepending a “c” before
the name. Thus:
#include <stdio.h>
#include <stdlib.h>
become:
#include <cstdio>
#include <cstdlib>
And so on, for all the Standard C headers. This provides a nice distinction to
the reader indicating when
you’re using C versus C++ libraries.
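As a hedged illustration of the “c”-prefixed headers in use, the sketch below calls two Standard C functions through the C++ headers. The helper nextNumber( ) is hypothetical, and the example assumes a C++11 or later compiler (for nullptr and std::snprintf).

```cpp
#include <cstdio>   // C++ form of <stdio.h>
#include <cstdlib>  // C++ form of <stdlib.h>
#include <string>

// Hypothetical helper: reads a decimal number from text with the C
// library's strtol, adds one, and formats the result with snprintf.
std::string nextNumber(const char* text) {
  long value = std::strtol(text, nullptr, 10);  // base-10 conversion
  char buffer[32];                              // ample room for a long
  std::snprintf(buffer, sizeof buffer, "%ld", value + 1);
  return std::string(buffer);
}
```

With these headers the C functions are reached through the std namespace, as in std::strtol, which makes it visible at a glance that a C library facility is being used from C++.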
The effect of the new include format is
not identical to the old: using the .h gives you the older, non-template
version, and omitting the .h gives you the new templatized version. You’ll
usually have problems if you try to intermix the two forms in a single program.
The linker collects object modules (which
often use file name extensions like .o or .obj), generated by the compiler, into
an executable program the operating system can load and run. It
is the last phase of the compilation process.
Linker characteristics vary from system
to system. In general, you just tell the linker the names of the object modules
and libraries you want linked together, and the name of the executable, and it
goes to work. Some systems require you to invoke the linker yourself. With most
C++ packages you invoke the linker through the C++ compiler. In many situations
the linker is invoked for you invisibly.
Some older linkers won’t search object files and libraries more than once, and
they search through the list you give them from left to right. This means that
the order of object files and libraries can be important. If you have a
mysterious problem that doesn’t show up until link time, one possibility is the
order in which the files are given to the linker.
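Now that you know the basic terminology, you can understand how to use a
library. To use a library:
1. Include the library’s header file.
2. Use the functions and variables in the library.
3. Link the library into the executable program.
These steps also apply when the object modules aren’t combined into a library.
Including a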
header file and linking the object modules are the basic steps for separate
compilation in both C and C++.
When you make an external reference to a function or variable in C or C++, the
linker, upon encountering this reference, can do one of two things. If it has
not already encountered the definition for the function or variable, it adds the
identifier to its list of “unresolved references.” If the linker has already
encountered the definition, the reference is resolved.
If the linker cannot find the definition in the list of object modules, it
searches the libraries. Libraries have some sort of indexing so the linker
doesn’t need to look through all the object modules in the library – it just
looks in the index. When the linker finds a definition in a library, the entire
object module, not just the function definition, is linked into the executable
program.
Note that the whole library isn’t linked, just the object module in the library
that contains the definition you want (otherwise programs would be unnecessarily
large). If you want to minimize executable program size, you might consider
putting a single function in each source code file when you build your own
libraries. This requires more editing[27], but it can be helpful to the user.
Because the linker searches files in the order you give them, you can pre-empt
the use of a library function by inserting a file with your own function, using
the same function name, into the list before the library name appears. Since the
linker will resolve any references to this function by using your function
before it searches the library, your function is used instead of the library
function. Note that this can also be a bug, and the kind of thing C++ namespaces
prevent.
When a C or C++ executable program is created, certain items are secretly linked
in. One of these is the startup module, which contains initialization routines
that must be run any time a C or C++ program begins to execute. These routines
set up the
stack and initialize certain variables in the program.
The linker always searches the standard
library for the compiled versions of any
“standard” functions called in the program. Because the standard
library is always searched, you can use anything in that library by simply
including the appropriate header file in your program; you don’t have to tell it
to search the standard library. The iostream functions, for example, are in the
Standard C++ library. To use them, you just include the <iostream> header file.
If you are using an add-on library, you must explicitly add the library name to
the list of files handed to the linker.
Just because you are writing code in C++, you are not prevented from using C
library functions. In fact, the entire C library is included by default into
Standard C++. There has been a tremendous amount of work done for you in these
functions, so they can save you a lot of time.
This book will use Standard C++ (and thus also Standard C) library functions
when convenient, but only standard library functions will be used, to ensure the
portability of programs. In the few cases in which library functions must be
used that are not in the C++ standard, all attempts will be made to use
POSIX-compliant functions. POSIX is a standard based on a Unix standardization
effort that includes functions that go beyond the scope of the C++ library. You
can generally expect to find POSIX functions on Unix (in particular, Linux)
platforms, and often under DOS/Windows. For example, if you’re using
multithreading you are better off using the POSIX thread library because your
code will then be easier to understand, port, and maintain (and the POSIX thread
library will usually just use the underlying thread facilities of the operating
system, if these are provided).
You now know almost enough of the basics
to create and compile a program. The program will use the Standard C++ iostream
classes. These read from and write to files and “standard” input and
output (which normally comes from and goes to the console, but may be redirected
to files or devices). In this simple program, a stream object will be used to
print a message on the screen.
To declare the functions and external data in the iostreams class, include the
header file with the statement
#include <iostream>
The first program uses the concept of standard output, which means “a
general-purpose place to send output.” You will see other examples using
standard output in different ways, but here it will just go to the console. The
iostream package automatically defines a variable (an object)
called cout that accepts all data bound for
standard output.
To send data to standard output, you use the operator <<. C programmers know
this operator as the “bitwise left shift,” which will be described in the next
chapter. Suffice it to say that a bitwise left shift has nothing to do with
output. However, C++ allows operators to be overloaded. When you overload an
operator, you give it a new meaning when that operator is used with an object of
a particular type. With iostream objects, the operator << means “send to.” For
example:
cout << "howdy!";
That’s enough operator overloading
to get you started. Chapter 12 covers operator overloading in
detail.
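As a preview only, here is a sketch of what giving << a new meaning for a type of your own can look like. The Point type and the describe( ) helper are hypothetical and are not part of any library. The std:: prefix names the standard namespace, and an ostringstream is used in place of cout so that the result can be captured as text.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical user-defined type, for illustration only.
struct Point {
  int x;
  int y;
};

// Overload: when << is used with an output stream on the left and a
// Point on the right, it now means "send this Point to the stream".
std::ostream& operator<<(std::ostream& os, const Point& p) {
  return os << "(" << p.x << ", " << p.y << ")";
}

// Helper that sends a Point to an in-memory stream and returns the text.
std::string describe(const Point& p) {
  std::ostringstream out;
  out << p;
  return out.str();
}
```

The built-in meaning of << for integers is unchanged; the new meaning applies only when the right-hand operand is a Point.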
As mentioned in Chapter 1, one of the problems encountered in the C language is
that you “run out of names” for functions and identifiers when your programs
reach a certain size. Of course, you don’t really run out of names; it does,
however, become harder to think of new ones after a while. More importantly,
when a program reaches a certain size it’s typically broken up into pieces, each
of which is built and maintained by a different person or group. Since C
effectively has a single arena where all the identifier and function names live,
this means that all the developers must be careful not to accidentally use the
same names in situations where they can conflict. This rapidly becomes tedious,
time-wasting, and, ultimately, expensive.
Standard C++ has a mechanism to prevent
this collision: the namespace keyword. Each set of C++ definitions in a
library or program is “wrapped” in a namespace, and if some other definition has
an identical name, but is in a different namespace, then there is no collision.
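A minimal sketch of two definitions with an identical name living side by side. The namespace names alpha_lib and beta_lib and the function scale( ) are invented for this example.

```cpp
// Two hypothetical libraries that both chose the name scale().
namespace alpha_lib {
  int scale(int value) { return value * 2; }
}

namespace beta_lib {
  int scale(int value) { return value * 10; }
}

// The namespace name in front of :: selects which definition is meant,
// so the two functions never collide.
int useBoth(int value) {
  return alpha_lib::scale(value) + beta_lib::scale(value);
}
```

Without the namespaces, the two definitions of scale(int) would be a duplicate definition and the program would be rejected.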
Namespaces are a convenient and helpful tool, but their presence means that you
must be aware of them before you can write any programs. If you simply include a
header file and use some functions or objects from that header, you’ll probably
get strange-sounding errors when you try to compile the program, to the effect
that the compiler cannot find any of the declarations for the items that you
just included in the header file!
After you see this message a few times you’ll become familiar with its
meaning (which is “You included the header file but all the declarations
are within a namespace and you didn’t tell the compiler that you wanted to
use the declarations in that namespace”).
There’s a keyword that allows you
to say “I want to use the declarations and/or definitions in this
namespace.” This keyword, appropriately enough, is using. All of the Standard
C++ libraries are wrapped in a single namespace, which is std (for “standard”).
As this book uses the standard libraries almost exclusively, you’ll see the
following using directive in almost every program:
using namespace std;
This means that you want to expose all
the elements from the namespace called std. After this statement, you don’t have
to worry that your particular library component is inside a namespace, since the
using directive makes that namespace available throughout the file where the
using directive was written.
Exposing all the elements from a namespace after someone has gone to the trouble
to hide them may seem a bit counterproductive, and in fact you should be careful
about thoughtlessly doing this (as you’ll learn later in the book). However, the
using directive exposes only those names for the current file, so it is not
quite as drastic as it first sounds. (But think twice about doing it in a header
file – that is reckless.)
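For comparison, here is a sketch of three ways to reach names in std, from most to least explicit. The function names are hypothetical, and an in-memory ostringstream stands in for cout so the output can be returned as text. Placing a using directive inside a function body, as the third function does, limits its effect to that function.

```cpp
#include <sstream>
#include <string>

// 1. Full qualification: every standard name carries the std:: prefix.
std::string qualified() {
  std::ostringstream out;
  out << "howdy!";
  return out.str();
}

// 2. A using declaration exposes one chosen name, here only within
//    this function body.
std::string withUsingDeclaration() {
  using std::ostringstream;
  ostringstream out;
  out << "howdy!";
  return out.str();
}

// 3. A using directive exposes every name in std, but because it is
//    written inside the function it applies only to this body.
std::string withUsingDirective() {
  using namespace std;
  ostringstream out;
  out << "howdy!";
  string result = out.str();
  return result;
}
```

All three functions produce the same text; they differ only in how much of the namespace they expose and where.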
There’s a relationship between
namespaces and the way header files are included. Before the modern header file
inclusion was standardized (without the trailing ‘.h’, as in <iostream>), the
typical way to include a header file was with the ‘.h’, such as <iostream.h>. At
that time, namespaces were not part of the language either. So to provide
backward compatibility with existing code, if you say
#include <iostream.h>
it means
#include <iostream>
using namespace std;
However, in this book the standard include format will be used (without the
‘.h’) and so the
using directive must be explicit.
For now, that’s all you need to know about namespaces, but in Chapter 10 the
subject is covered much more thoroughly.
A C or C++ program is a collection of variables, function definitions, and
function calls. When the program starts, it executes initialization code and
calls a special function, “main( ).” You put the primary
code for the program here.
As mentioned earlier, a function definition consists of a return type (which
must be specified in C++), a function name, an argument list in parentheses, and
the function code contained in braces. Here is a sample function definition:
int function() {
  // Function code here (this is a comment)
}
The function above has an empty argument
list and a body that contains only a comment.
There can be many sets of braces within a function definition, but there must
always be at least one set surrounding the function body. Since main( ) is a
function, it must follow these rules. In C++, main( ) always has return type of
int.
C and C++ are free form languages. With few exceptions, the compiler ignores
newlines and white space, so it must have some way to determine the end of a
statement. Statements are delimited by
semicolons.
C comments start with /* and end
with */. They can include newlines. C++ uses C-style comments and has an
additional type of comment: //. The // starts a comment that
terminates with a newline. It is more convenient than /* */ for one-line
comments, and is used extensively in this book.
And now, finally, the first program:
//: C02:Hello.cpp
// Saying Hello with C++
#include <iostream> // Stream declarations
using namespace std;
int main() {
  cout << "Hello, World! I am "
       << 8 << " Today!" << endl;
} ///:~
The cout object is handed a series
of arguments via the ‘<<’ operators. It prints out
these arguments in left-to-right order. The special iostream function endl
outputs a newline and flushes the stream, so the line is written out. With
iostreams, you can string together a series of arguments like this, which makes
the class easy to use.
In C
text inside double quotes is
traditionally called a “string.” However
the
Standard C++ library has a powerful class called string for manipulating
text
and so I shall use the more precise term character array for text
inside double quotes.
The compiler creates storage for
character arrays and stores the ASCII equivalent for each character in this
storage. The compiler automatically terminates this array of characters with an
extra piece of storage containing the value 0 to indicate the end of the
character array.
Inside a character array
you can insert
special characters by using escape sequences.
These consist of a backslash (\) followed by a special code. For example
\n means newline. Your compiler manual or local C
guide gives a complete set of escape sequences; others include \t
(tab)
\\ (backslash)
and
\b (backspace).
Notice that the statement can continue over multiple lines, and that the
entire statement terminates with a semicolon.
Character array arguments and constant numbers are mixed together in the above
cout statement. Because the operator << is overloaded with a variety of
meanings when used with cout, you can send cout a variety of different
arguments and it will “figure out what to do with the message.”
Throughout this book you’ll notice that the first line of each file will be a
comment that starts with the characters that start a comment (typically //),
followed by a colon, and the last line of the listing will end with a comment
followed by ‘/:~’. This is a technique I use to allow easy extraction of
information from code files (the program to do this can be found in volume two
of this book, at www.BruceEckel.com). The first line also has the name and
location of the file, so it can be referred to in text and in other files, and
so you can easily locate it in the source code for this book (which is
downloadable from www.BruceEckel.com).
After downloading and unpacking the book’s source code, find the program in
the subdirectory C02. Invoke the compiler with Hello.cpp as the argument. For
simple, one-file programs like this one, most compilers will take you all the
way through the process. For example, to use the GNU C++ compiler (which is
freely available on the Internet), you write:
g++ Hello.cpp
So far you have seen only the most rudimentary aspect of the iostreams class.
The output formatting available with iostreams also includes features such as
number formatting in decimal, octal, and hexadecimal. Here’s another example
of the use of iostreams:
//: C02:Stream2.cpp
// More streams features
#include <iostream>
using namespace std;
int main() {
  // Specifying formats with manipulators:
  cout << "a number in decimal: "
       << dec << 15 << endl;
  cout << "in octal: " << oct << 15 << endl;
  cout << "in hex: " << hex << 15 << endl;
  cout << "a floating-point number: "
       << 3.14159 << endl;
  cout << "non-printing char (escape): "
       << char(27) << endl;
} ///:~
This example shows the iostreams class printing numbers in decimal, octal, and
hexadecimal using iostream manipulators (which don’t print anything, but
change the state of the output stream). The formatting of floating-point
numbers is determined automatically by the compiler. In addition, any
character can be sent to a stream object using a cast to a char (a char is a
data type that holds single characters). This cast looks like a function call:
char( ), along with the character’s ASCII value. In the program above, the
char(27) sends an “escape” to cout.
An important feature of the C preprocessor is character array concatenation.
This feature is used in some of the examples in this book. If two quoted
character arrays are adjacent, and no punctuation is between them, the
compiler will paste the character arrays together into a single character
array. This is particularly useful when code listings have width restrictions:
//: C02:Concat.cpp
// Character array Concatenation
#include <iostream>
using namespace std;
int main() {
  cout << "This is far too long to put on a "
    "single line but it can be broken up with "
    "no ill effects\nas long as there is no "
    "punctuation separating adjacent character "
    "arrays.\n";
} ///:~
At first, the code above can look like an error because there’s no familiar
semicolon at the end of each line. Remember that C and C++ are free-form
languages, and although you’ll usually see a semicolon at the end of each
line, the actual requirement is for a semicolon at the end of each statement,
and it’s possible for a statement to continue over several lines.
The iostreams classes provide the ability to read input. The object used for
standard input is cin (for “console input”). cin normally expects input from
the console, but this input can be redirected from other sources. An example
of redirection is shown later in this chapter.
The iostreams operator used with cin is >>. This operator waits for the same
kind of input as its argument. For example, if you give it an integer
argument, it waits for an integer from the console. Here’s an example:
//: C02:Numconv.cpp
// Converts decimal to octal and hex
#include <iostream>
using namespace std;
int main() {
  int number;
  cout << "Enter a decimal number: ";
  cin >> number;
  cout << "value in octal = 0"
       << oct << number << endl;
  cout << "value in hex = 0x"
       << hex << number << endl;
} ///:~
While the typical way to use a program that reads from standard input and
writes to standard output is within a Unix shell script or DOS batch file, any
program can be called from inside a C or C++ program using the Standard C
system( ) function, which is declared in the header file <cstdlib>:
//: C02:CallHello.cpp
// Call another program
#include <cstdlib> // Declare "system()"
using namespace std;
int main() {
  system("Hello");
} ///:~
To use the system( ) function, you give it a character array that you would
normally type at the operating system command prompt. This can also include
command-line arguments, and the character array can be one that you fabricate
at run time (instead of just using a static character array as shown above).
The command executes and control returns to the program.
This program shows you how easy it is to use plain C library functions in C++;
just include the header file and call the function. This upward compatibility
from C to C++ is a big advantage if you are learning the language starting
from a background in C.
While a character array can be fairly useful, it is quite limited. It’s simply
a group of characters in memory, but if you want to do anything with it you
must manage all the little details. For example, the size of a quoted
character array is fixed at compile time. If you have a character array and
you want to add some more characters to it, you’ll need to understand quite a
lot (including dynamic memory management, character array copying, and
concatenation) before you can get your wish. This is exactly the kind of thing
we’d like to have an object do for us.
The Standard C++ string class is designed to take care of (and hide) all the
low-level manipulations of character arrays that were previously required of
the C programmer. These manipulations have been a constant source of
time-wasting and errors since the inception of the C language. So, although an
entire chapter is devoted to the string class in Volume 2 of this book, the
string is so important and it makes life so much easier that it will be
introduced here and used in much of the early part of the book.
To use strings you include the C++ header file <string>. The string class is
in the namespace std so a using directive is necessary. Because of operator
overloading, the syntax for using strings is quite intuitive:
//: C02:HelloStrings.cpp
// The basics of the Standard C++ string class
#include <string>
#include <iostream>
using namespace std;
int main() {
  string s1, s2; // Empty strings
  string s3 = "Hello, World."; // Initialized
  string s4("I am"); // Also initialized
  s2 = "Today"; // Assigning to a string
  s1 = s3 + " " + s4; // Combining strings
  s1 += " 8 "; // Appending to a string
  cout << s1 + s2 + "!" << endl;
} ///:~
The first two strings, s1 and s2, start out empty, while s3 and s4 show two
equivalent ways to initialize string objects from character arrays (you can
just as easily initialize string objects from other string objects).
You can assign to any string object using ‘=’. This replaces the previous
contents of the string with whatever is on the right-hand side, and you don’t
have to worry about what happens to the previous contents – that’s handled
automatically for you. To combine strings you simply use the ‘+’ operator,
which also allows you to combine character arrays with strings. If you want to
append either a string or a character array to another string, you can use the
operator ‘+=’. Finally, note that iostreams already know what to do with
strings, so you can just send a string (or an expression that produces a
string, which happens with s1 + s2 + "!") directly to cout in order to print
it.
In C, the process of opening and manipulating files requires a lot of language
background to prepare you for the complexity of the operations. However, the
C++ iostream library provides a simple way to manipulate files, and so this
functionality can be introduced much earlier than it would be in C.
To open files for reading and writing, you must include <fstream>. Although on
many implementations this will automatically include <iostream>, that is not
guaranteed, so it’s generally prudent to explicitly include <iostream> if
you’re planning to use cin, cout, etc.
To open a file for reading, you create an ifstream object, which then behaves
like cin. To open a file for writing, you create an ofstream object, which
then behaves like cout. Once you’ve opened the file, you can read from it or
write to it just as you would with any other iostream object. It’s that simple
(which is, of course, the whole point).
One of the most useful functions in the iostream library is getline( ), which
allows you to read one line (terminated by a newline) into a string
object[28]. The first argument is the ifstream object you’re reading from and
the second argument is the string object. When the function call is finished,
the string object will contain the line.
Here’s a simple example, which copies the contents of one file into another:
//: C02:Scopy.cpp
// Copy one file to another, a line at a time
#include <string>
#include <fstream>
using namespace std;
int main() {
  ifstream in("Scopy.cpp"); // Open for reading
  ofstream out("Scopy2.cpp"); // Open for writing
  string s;
  while(getline(in, s)) // Discards newline char
    out << s << "\n"; // ... must add it back
} ///:~
To open the files, you just hand the ifstream and ofstream objects the names
of the files you want to open, as seen above.
There is a new concept introduced here, which is the while loop. Although this
will be explained in detail in the next chapter, the basic idea is that the
expression in parentheses following the while controls the execution of the
subsequent statement (which can also be multiple statements, wrapped inside
curly braces). As long as the expression in parentheses (in this case,
getline(in, s)) produces a “true” result, then the statement controlled by the
while will continue to execute. It turns out that getline( ) will return a
value that can be interpreted as “true” if another line has been read
successfully, and “false” upon reaching the end of the input. Thus, the above
while loop reads every line in the input file and sends each line to the
output file.
getline( ) reads in the characters of each line until it discovers a newline
(the termination character can be changed, but that won’t be an issue until
the iostreams chapter in Volume 2). However, it discards the newline and
doesn’t store it in the resulting string object. Thus, if we want the copied
file to look just like the source file, we must add the newline back in, as
shown.
Another example reads an entire file into a single string object:
//: C02:FillString.cpp
// Read an entire file into a single string
#include <string>
#include <iostream>
#include <fstream>
using namespace std;
int main() {
  ifstream in("FillString.cpp");
  string s, line;
  while(getline(in, line))
    s += line + "\n";
  cout << s;
} ///:~
Because of the dynamic nature of strings, you don’t have to worry about how
much storage to allocate for a string; you can just keep adding things and the
string will keep expanding to hold whatever you put into it.
One of the nice things about putting an entire file into a string is that the
string class has many functions for searching and manipulation that would then
allow you to modify the file as a single string. However, this has its
limitations. For one thing, it is often convenient to treat a file as a
collection of lines instead of just a big blob of text. For example, if you
want to add line numbering it’s much easier if you have each line as a
separate string object. To accomplish this, we’ll need another approach.
With strings, we can fill up a string object without knowing how much storage
we’re going to need. The problem with reading lines from a file into
individual string objects is that you don’t know up front how many strings
you’re going to need – you only know after you’ve read the entire file. To
solve this problem, we need some sort of holder that will automatically expand
to contain as many string objects as we care to put into it.
In fact, why limit ourselves to holding string objects? It turns out that this
kind of problem – not knowing how many of something you have while you’re
writing a program – happens a lot. And this “container” object sounds like it
would be more useful if it would hold any kind of object at all! Fortunately,
the Standard C++ Library has a ready-made solution: the standard container
classes. The container classes are one of the real powerhouses of Standard
C++.
There is often a bit of confusion between the containers and algorithms in the
Standard C++ Library, and the entity known as the STL. The Standard Template
Library was the name Alex Stepanov (who was working at Hewlett-Packard at the
time) used when he presented his library to the C++ Standards Committee at the
meeting in San Diego, California in Spring 1994. The name stuck, especially
after HP decided to make it available for public downloads. Meanwhile, the
committee integrated it into the Standard C++ Library, making a large number
of changes. STL's development continues at Silicon Graphics (SGI; see
http://www.sgi.com/Technology/STL). The SGI STL diverges from the Standard C++
Library on many subtle points. So although it's a popular misconception, the
C++ Standard does not “include” the STL. It can be a bit confusing since the
containers and algorithms in the Standard C++ Library have the same root (and
usually the same names) as the SGI STL. In this book, I will say “The Standard
C++ Library” or “The Standard Library containers,” or something similar and
will avoid the term “STL.”
Even though the implementation of the Standard C++ Library containers and
algorithms uses some advanced concepts and the full coverage takes two large
chapters in Volume 2 of this book, this library can also be potent without
knowing a lot about it. It’s so useful that the most basic of the standard
containers, the vector, is introduced in this early chapter and used
throughout the book. You’ll find that you can do a tremendous amount just by
using the basics of vector and not worrying about the underlying
implementation (again, an important goal of OOP). Since you’ll learn much more
about this and the other containers when you reach the Standard Library
chapters in Volume 2, it seems forgivable if the programs that use vector in
the early portion of the book aren’t exactly what an experienced C++
programmer would do. You’ll find that in most cases, the usage shown here is
adequate.
The vector class is a template, which means that it can be efficiently applied
to different types. That is, we can create a vector of shapes, a vector of
cats, a vector of strings, etc. Basically, with a template you can create a
“class of anything.” To tell the compiler what it is that the class will work
with (in this case, what the vector will hold), you put the name of the
desired type in “angle brackets,” which means ‘<’ and ‘>’. So a vector of
string would be denoted vector<string>. When you do this, you end up with a
customized vector that will hold only string objects, and you’ll get an error
message from the compiler if you try to put anything else into it.
Since vector expresses the concept of a “container,” there must be a way to
put things into the container and get things back out of the container. To add
a brand-new element on the end of a vector, you use the member function
push_back( ). (Remember that, since it’s a member function, you use a ‘.’ to
call it for a particular object.) The reason the name of this member function
might seem a bit verbose – push_back( ) instead of something simpler like
“put” – is because there are other containers and other member functions for
putting new elements into containers. For example, there is an insert( )
member function to put something in the middle of a container. vector supports
this but its use is more complicated and we won’t need to explore it until
Volume 2 of the book. There’s also a push_front( ) (not part of vector) to put
things at the beginning. There are many more member functions in vector and
many more containers in the Standard C++ Library, but you’ll be surprised at
how much you can do just knowing about a few simple features.
So you can put new elements into a vector with push_back( ), but how do you
get these elements back out again? This solution is more clever and elegant –
operator overloading is used to make the vector look like an array. The array
(which will be described more fully in the next chapter) is a data type that
is available in virtually every programming language so you should already be
somewhat familiar with it. Arrays are aggregates, which means they consist of
a number of elements clumped together. The distinguishing characteristic of an
array is that these elements are the same size and are arranged to be one
right after the other. Most importantly, these elements can be selected by
“indexing,” which means you can say “I want element number n” and that element
will be produced, usually quickly. Although there are exceptions in
programming languages, the indexing is normally achieved using square
brackets, so if you have an array a and you want to produce element five, you
say a[4] (note that indexing always starts at zero).
This very compact and powerful indexing notation is incorporated into the
vector using operator overloading, just like ‘<<’ and ‘>>’ were incorporated
into iostreams. Again, you don’t need to know how the overloading was
implemented – that’s saved for a later chapter – but it’s helpful if you’re
aware that there’s some magic going on under the covers in order to make the
[ ] work with vector.
With that in mind, you can now see a program that uses vector. To use a
vector, you include the header file <vector>:
//: C02:Fillvector.cpp
// Copy an entire file into a vector of string
#include <string>
#include <iostream>
#include <fstream>
#include <vector>
using namespace std;
int main() {
  vector<string> v;
  ifstream in("Fillvector.cpp");
  string line;
  while(getline(in, line))
    v.push_back(line); // Add the line to the end
  // Add line numbers:
  for(int i = 0; i < v.size(); i++)
    cout << i << ": " << v[i] << endl;
} ///:~
Much of this program is similar to the previous one; a file is opened and
lines are read into string objects one at a time. However, these string
objects are pushed onto the back of the vector v. Once the while loop
completes, the entire file is resident in memory, inside v.
The next statement in the program is called a for loop. It is similar to a
while loop except that it adds some extra control. After the for, there is a
“control expression” inside of parentheses, just like the while loop. However,
this control expression is in three parts: a part which initializes, one that
tests to see if we should exit the loop, and one that changes something,
typically to step through a sequence of items. This program shows the for loop
in the way you’ll see it most commonly used: the initialization part
int i = 0 creates an integer i to use as a loop counter and gives it an
initial value of zero. The testing portion says that to stay in the loop, i
should be less than the number of elements in the vector v. (This is produced
using the member function size( ), which I just sort of slipped in here, but
you must admit it has a fairly obvious meaning.) The final portion uses a
shorthand for C and C++, the “auto-increment” operator, to add one to the
value of i. Effectively, i++ says “get the value of i, add one to it, and put
the result back into i.” Thus, the total effect of the for loop is to take a
variable i and march it through the values from zero to one less than the size
of the vector. For each value of i, the cout statement is executed and this
builds a line that consists of the value of i (magically converted to a
character array by cout), a colon and a space, the line from the file, and a
newline provided by endl. When you compile and run it you’ll see the effect is
to add line numbers to the file.
Because of the way that the ‘>>’ operator works with iostreams, you can easily
modify the program above so that it breaks up the input into
whitespace-separated words instead of lines:
//: C02:GetWords.cpp
// Break a file into whitespace-separated words
#include <string>
#include <iostream>
#include <fstream>
#include <vector>
using namespace std;
int main() {
  vector<string> words;
  ifstream in("GetWords.cpp");
  string word;
  while(in >> word)
    words.push_back(word);
  for(int i = 0; i < words.size(); i++)
    cout << words[i] << endl;
} ///:~
The expression
while(in >> word)
is what gets the input one “word” at a time, and when this expression
evaluates to “false” it means the end of the file has been reached. Of course,
delimiting words by whitespace is quite crude, but it makes for a simple
example. Later in the book you’ll see more sophisticated examples that let you
break up input just about any way you’d like.
To demonstrate how easy it is to use a vector with any type, here’s an example
that creates a vector<int>:
//: C02:Intvector.cpp
// Creating a vector that holds integers
#include <iostream>
#include <vector>
using namespace std;
int main() {
  vector<int> v;
  for(int i = 0; i < 10; i++)
    v.push_back(i);
  for(int i = 0; i < v.size(); i++)
    cout << v[i] << ", ";
  cout << endl;
  for(int i = 0; i < v.size(); i++)
    v[i] = v[i] * 10; // Assignment
  for(int i = 0; i < v.size(); i++)
    cout << v[i] << ", ";
  cout << endl;
} ///:~
To create a vector that holds a different type, you just put that type in as
the template argument (the argument in angle brackets). Templates and
well-designed template libraries are intended to be exactly this easy to use.
This example goes on to demonstrate another essential feature of vector. In
the expression
v[i] = v[i] * 10;
you can see that the vector is not limited to only putting things in and
getting things out. You also have the ability to assign to (and thus to
change) any element of a vector, also through the use of the square-brackets
indexing operator. This means that vector is a general-purpose, flexible
“scratchpad” for working with collections of objects, and we will definitely
make use of it in coming chapters.
The intent of this chapter is to show you how easy object-oriented programming
can be – if someone else has gone to the work of defining the objects for you.
In that case, you include a header file, create the objects, and send messages
to them. If the types you are using are powerful and well-designed, then you
won’t have to do much work and your resulting program will also be powerful.
In the process of showing the ease of OOP when using library classes, this
chapter also introduced some of the most basic and useful types in the
Standard C++ library: the family of iostreams (in particular, those that read
from and write to the console and files), the string class, and the vector
template. You’ve seen how straightforward it is to use these and can now
probably imagine many things you can accomplish with them, but there’s
actually a lot more that they’re capable of[29]. Even though we’ll only be
using a limited subset of the functionality of these tools in the early part
of the book, they nonetheless provide a large step up from the primitiveness
of learning a low-level language like C. And while learning the low-level
aspects of C is educational, it’s also time consuming. In the end, you’ll be
much more productive if you’ve got objects to manage the low-level issues.
After all, the whole point of OOP is to hide the details so you can “paint
with a bigger brush.”
However, as high-level as OOP tries to be, there are some fundamental aspects
of C that you can’t avoid knowing, and these will be covered in the next
chapter.
Solutions to selected exercises can be found in the electronic document The
Thinking in C++ Annotated Solution Guide, available for a small fee from
http://www.BruceEckel.com
[25] The boundary between compilers and interpreters can tend to become a bit
fuzzy, especially with Python, which has many of the features and power of a
compiled language but the quick turnaround of an interpreted language.
[26] Python is again an exception, since it also provides separate
compilation.
[27] I would recommend using Perl or Python to automate this task as part of
your library-packaging process (see www.Perl.org or www.Python.org).
[28] There are actually a number of variants of getline( ), which will be
discussed thoroughly in the iostreams chapter in Volume 2.
[29] If you’re particularly eager to see all the things that can be done with
these and other Standard library components, see Volume 2 of this book at
www.BruceEckel.com, and also www.dinkumware.com.