Don't hesitate to send in feedback: send an e-mail if you like the C++ Annotations; if you think that important material was omitted; if you find errors or typos in the text or the code examples; or if you just feel like e-mailing. Send your e-mail to Frank B. Brokken. Please state the document version you're referring to, as found in the title (in this document: 6.2.3) and please state the chapter and paragraph name or number you're referring to.
All received mail is processed conscientiously, and received suggestions for improvements will usually have been processed by the time a new version of the Annotations is released. Except for the incidental case I will normally not acknowledge the receipt of suggestions for improvements. Please don't interpret this as me not appreciating your efforts.
In this chapter several concrete examples of C++ programs, classes and
templates will be presented. Topics covered by this document such as virtual
functions, static members, etc. are illustrated in this chapter. The
examples roughly follow the organization of earlier chapters.
First, examples using stream classes are presented, including some
detailed examples illustrating polymorphism. With the advent of the
ANSI/ISO standard, classes supporting streams based on file descriptors
are no longer available, including the Gnu procbuf extension. These classes
were frequently used in older C++ programs. This section of the C++
Annotations develops an alternative: classes extending streambuf, allowing
the use of file descriptors, and classes around the fork() system call.
Next, several templates will be developed, both template functions and full template classes.
Finally, we'll touch on the subjects of scanner and parser generators, and show how these tools may be used in C++ programs. These final examples assume a certain familiarity with the concepts underlying these tools, like grammars, parse-trees and parse-tree decoration. Once the input for a program exceeds a certain level of complexity, it's advantageous to use scanner- and parser-generators to produce code doing the actual input recognition. One of the examples in this chapter describes the usage of these tools in a C++ environment.
The class streambuf can be used as the starting point for constructing
classes interfacing file descriptors.
In this section we will construct classes which may be used to write to a device identified by a file descriptor: it may be a file, but it could also be a pipe or socket. Section 20.1.2 discusses reading from devices given their file descriptors, while section 20.3.1 reconsiders redirection, discussed earlier in section 5.8.3.
Basically, deriving a class for output operations is simple. The only
member function that must be overridden is the virtual member
int overflow(int c). This member is responsible for writing characters to
the device once the class' buffer is full. If fd is a file descriptor to
which information may be written, and if we decide against using a buffer,
then the member overflow() can simply be:
class UnbufferedFD: public std::streambuf
{
    public:
        int overflow(int c)
        {
            if (c != EOF)
            {
                char ch = c;        // write one char, not the first
                                    // byte of an int
                if (write(fd, &ch, 1) != 1)
                    return EOF;
            }
            return c;
        }
        ...
};
The argument received by overflow() is either written as a value of
type char to the file descriptor, or EOF is returned.
This simple function does not use an output buffer. As the use of a buffer is strongly advised (see also the next section), the construction of a class using an output buffer will be discussed next in somewhat greater detail.
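The unbuffered approach can be tried out in isolation. The following is a minimal sketch, not the class developed in the Annotations: the names RawFdBuf and throughPipe() are hypothetical, and a pipe is used merely as a convenient device to write to and read back from.

```cpp
#include <streambuf>
#include <ostream>
#include <string>
#include <cstdio>       // EOF
#include <unistd.h>     // write(), read(), pipe(), close()

// Hypothetical unbuffered streambuf: since no put-area is defined, every
// inserted character ends up in overflow() and is written immediately.
class RawFdBuf: public std::streambuf
{
    int d_fd;

    public:
        explicit RawFdBuf(int fd)
        :
            d_fd(fd)
        {}

    protected:
        int overflow(int c) override
        {
            if (c != EOF)
            {
                char ch = static_cast<char>(c);
                if (write(d_fd, &ch, 1) != 1)
                    return EOF;
            }
            return c;
        }
};

// Illustration only: insert `text' into a pipe via an ostream using a
// RawFdBuf, then read the pipe's contents back (short texts only).
std::string throughPipe(std::string const &text)
{
    int fds[2];
    if (pipe(fds) != 0)
        return "";

    RawFdBuf buf(fds[1]);
    std::ostream out(&buf);
    out << text;
    close(fds[1]);                  // writing end closed: reader sees EOF

    std::string result;
    char ch;
    while (read(fds[0], &ch, 1) == 1)
        result += ch;
    close(fds[0]);
    return result;
}
```

Calling throughPipe("hello\n") should return the same text. Each character costs one write() system call, which is the main reason why a buffer is advised.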
When an output buffer is used, the overflow() member will be a bit
more complex, as it is now only called when the buffer is full. Once the
buffer is full, we first have to flush the buffer, for which the (virtual)
function streambuf::sync() is available. Since sync() is a virtual
function, classes derived from std::streambuf may redefine sync() to
flush a buffer std::streambuf itself doesn't know about.
Overriding sync() and using it in overflow() is not all that must
be done: eventually we might have less information than fits into the
buffer. So, at the end of the lifetime of our special streambuf object,
its buffer might only be partially full. Therefore, we must make sure that
the buffer is flushed once our object goes out of scope. This is of course
very simple: sync() should be called by the destructor as well.
Now that we've considered the consequences of using an output buffer, we're almost ready to construct our derived class. We will add a couple of additional features, though.
The class ofdnstreambuf has the following characteristics:
The class is derived from std::streambuf:
class ofdnstreambuf: public std::streambuf
unsigned d_bufsize;
int d_fd;
char *d_buffer;
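The default constructor merely initializes the buffer to 0. The constructor
expecting a file descriptor and a buffer size passes its arguments on to the
open() member (see below). Here are the constructors: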
ofdnstreambuf()
:
d_bufsize(0),
d_buffer(0)
{}
ofdnstreambuf(int fd, unsigned bufsize = 1)
{
open(fd, bufsize);
}
The destructor calls sync(), writing any characters stored in the output
buffer to the device. If there's no buffer, the destructor needs to perform
no actions:
~ofdnstreambuf()
{
if (d_buffer)
{
sync();
delete [] d_buffer;
}
}
The open() member initializes the buffer. Using setp() the begin and end
points of the buffer are set. This is used by the streambuf base class to
initialize pbase(), pptr() and epptr():
void open(int fd, unsigned bufsize = 1)
{
d_fd = fd;
d_bufsize = bufsize == 0 ? 1 : bufsize;
d_buffer = new char[d_bufsize];
setp(d_buffer, d_buffer + d_bufsize);
}
The member sync() will write any not yet flushed characters in
the buffer to the device. Next, the buffer is reinitialized using
setp(). Note that sync() returns 0 after a successful flush
operation:
int sync()
{
if (pptr() > pbase())
{
write(d_fd, d_buffer, pptr() - pbase());
setp(d_buffer, d_buffer + d_bufsize);
}
return 0;
}
Finally, the member overflow() is overridden. Since this member is called
from the streambuf base
class when the buffer is full, sync() is called first to flush the filled
up buffer to the device. As this recreates an empty buffer, the character
c which could not be written to the buffer by the streambuf base class
is now entered into the buffer using the member functions pptr() and
pbump(). Notice that entering a character into the
buffer is realized using available streambuf member functions, rather than
doing it `by hand', which might invalidate streambuf's internal
bookkeeping:
int overflow(int c)
{
sync();
if (c != EOF)
{
*pptr() = c;
pbump(1);
}
return c;
}
Apart from streambuf, the header file unistd.h must have been read by the
compiler before the implementations of the member functions can be compiled.
The following program uses the ofdnstreambuf class to copy its standard
input to file descriptor STDOUT_FILENO, which is the symbolic name of the
file descriptor used for the standard output. Here is the program:
#include <string>
#include <iostream>
#include <istream>
#include "fdout.h"
using namespace std;
int main(int argc, char **argv)
{
ofdnstreambuf fds(STDOUT_FILENO, 500);
ostream os(&fds);
switch (argc)
{
case 1:
os << "COPYING cin LINE BY LINE\n";
for (string s; getline(cin, s); )
os << s << endl;
break;
case 2:
os << "COPYING cin BY EXTRACTING TO os.rdbuf()\n";
cin >> os.rdbuf(); // Alternatively, use: cin >> &fds;
break;
case 3:
os << "COPYING cin BY INSERTING cin.rdbuf() into os\n";
os << cin.rdbuf();
break;
}
}
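The sync() member shown earlier ignores the return value of write(). A write() call may, however, write fewer bytes than requested, or it may be interrupted. The following sketch shows one way to handle that; writeAll() is a hypothetical helper and not part of the Annotations' class.

```cpp
#include <unistd.h>     // write(), ssize_t
#include <cerrno>       // errno, EINTR
#include <cstddef>      // std::size_t

// Hypothetical helper: keeps calling write() until all n bytes have been
// written. Returns true on success, false on an unrecoverable error.
bool writeAll(int fd, char const *data, std::size_t n)
{
    while (n > 0)
    {
        ssize_t written = write(fd, data, n);
        if (written < 0)
        {
            if (errno == EINTR)     // interrupted by a signal: retry
                continue;
            return false;
        }
        data += written;            // skip the bytes written so far
        n -= written;
    }
    return true;
}
```

Under these assumptions sync() could call writeAll(d_fd, pbase(), pptr() - pbase()) and return -1 when the helper reports failure, so that the stream's state reflects the error.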
When classes used for input operations are derived from std::streambuf, the
class should use an input buffer of at least one character, to allow the use
of the member functions istream::putback() or istream::unget().
Stream classes (like istream) normally allow us to unget at least one
character using their member functions putback() or unget(). This is
important, as these stream classes usually interface to streambuf
objects. Although it is strictly speaking not necessary to implement a buffer
in classes derived from streambuf, using buffers in these cases is
strongly advised: the implementation is very simple and straightforward, and
the applicability of such classes will be greatly improved. Therefore, in all
our classes derived from the class streambuf at least a buffer of one
character will be defined.
20.1.2.1: Using a one-character buffer
When deriving a class (e.g., ifdstreambuf) from streambuf using a
buffer of one character, at least its member streambuf::underflow() should
be overridden, as this is the member to which all requests for input are
eventually directed. Since a buffer is also needed, the member
streambuf::setg() is used to inform the streambuf base class of the
size of the input buffer, so that it is able to set up its input buffer
pointers correctly. This will ensure that eback(), gptr(), and egptr()
return correct values.
The required class shows the following characteristics:
The class is derived from std::streambuf as well:
class ifdstreambuf: public std::streambuf
The class defines protected data members
so that derived classes (e.g., see section 20.1.2.3) can access them:
protected:
int d_fd;
char d_buffer[1];
The constructor initializes the buffer in such a way that gptr() will be
equal to egptr(). Since this implies that the buffer is empty, underflow()
will immediately be called to refill the buffer:
ifdstreambuf(int fd)
:
d_fd(fd)
{
setg(d_buffer, d_buffer + 1, d_buffer + 1);
}
Finally, underflow() is overridden. It will first ensure that the
buffer is really empty. If not, then the next character in the buffer is
returned. If the buffer is really empty, it is refilled by reading from the
file descriptor. If this fails (for whatever reason), EOF is
returned. More sophisticated implementations could react more intelligently
here, of course. If the buffer could be refilled, setg() is called to set
up streambuf's buffer pointers correctly.
Apart from streambuf, the header file unistd.h must have been read by the
compiler before the implementations of the member functions can be compiled.
This completes the ifdstreambuf class. It is used
in the following program:
#include <iostream>
#include <istream>
#include "ifdbuf.h"
using namespace std;
int main()
{
ifdstreambuf fds(0);
istream is(&fds);
cout << is.rdbuf();
}
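The underflow() member described for the one-character buffer can be sketched as follows. This is a reconstruction from the description given in this section, under the assumption that the class has the two data members and the constructor shown there. The class name OneCharFdBuf is hypothetical, chosen to avoid any suggestion that this is the Annotations' own listing.

```cpp
#include <streambuf>
#include <istream>
#include <string>
#include <cstdio>       // EOF
#include <unistd.h>     // read(), pipe(), write(), close()

class OneCharFdBuf: public std::streambuf       // hypothetical name
{
    protected:
        int d_fd;
        char d_buffer[1];

    public:
        explicit OneCharFdBuf(int fd)
        :
            d_fd(fd)
        {
            // gptr() == egptr(): the buffer starts out empty
            setg(d_buffer, d_buffer + 1, d_buffer + 1);
        }

    protected:
        int underflow() override
        {
            if (gptr() < egptr())               // buffer not yet empty
                return static_cast<unsigned char>(*gptr());

            if (read(d_fd, d_buffer, 1) <= 0)   // end of input or error
                return EOF;

            setg(d_buffer, d_buffer, d_buffer + 1);  // 1 char available
            return static_cast<unsigned char>(*gptr());
        }
};
```

The conversion via unsigned char ensures that a character with value 0xff is not mistaken for EOF on platforms where char is a signed type.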
20.1.2.2: Using an n-character buffer
How complex would things get if we decided to use a buffer of
substantial size? Not that complex. The following class allows us to specify
the size of a buffer, but apart from that it is basically the same class as
ifdstreambuf developed in the previous section. To make things a bit more
interesting, in the class ifdnstreambuf developed here the member
streambuf::xsgetn() is also overridden, to optimize reading of series of
characters. Furthermore, a default constructor is provided which can be used
in combination with the open() member to construct an istream object
before the file descriptor becomes available. Then, once the descriptor
becomes available, the open() member can be used to initialize the object's
buffer. Later, in section 20.3 we'll encounter such a situation.
To save some space, the successful operation of various calls was not
checked. In `real life' implementations these checks should of course not be
omitted. The class ifdnstreambuf has the following characteristics:
The class is derived from std::streambuf:
class ifdnstreambuf: public std::streambuf
Like the class ifdstreambuf (section 20.1.2.1), its data members are
protected. Since the buffer's size is configurable, this size is kept in a
dedicated data member d_bufsize:
protected:
int d_fd;
unsigned d_bufsize;
char* d_buffer;
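The default constructor does not allocate a buffer. A second constructor
simply passes its arguments to open(), which will then
initialize the object so that it can actually be used: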
ifdnstreambuf()
:
d_bufsize(0),
d_buffer(0)
{}
ifdnstreambuf(int fd, unsigned bufsize = 1)
{
open(fd, bufsize);
}
If the object has been initialized by open(), its destructor will
both delete the object's buffer and use the file descriptor to close the
device:
~ifdnstreambuf()
{
if (d_bufsize)
{
close(d_fd);
delete [] d_buffer;
}
}
The open() member simply allocates the object's buffer. It is
assumed that the calling program has already opened the device. Once the
buffer has been allocated, the base class member setg() is used to
ensure that eback(), gptr(), and egptr() return correct values:
void open(int fd, unsigned bufsize = 1)
{
d_fd = fd;
d_bufsize = bufsize;
d_buffer = new char[d_bufsize];
setg(d_buffer, d_buffer + d_bufsize, d_buffer + d_bufsize);
}
The member underflow() is implemented almost
identically to ifdstreambuf's (section 20.1.2.1) member. The only
difference is that the current class supports buffers of larger
sizes. Therefore, more characters (up to d_bufsize) may be read from the
device at once:
int underflow()
{
    if (gptr() < egptr())
        return static_cast<unsigned char>(*gptr());

    int nread = read(d_fd, d_buffer, d_bufsize);
    if (nread <= 0)
        return EOF;

    setg(d_buffer, d_buffer, d_buffer + nread);
                            // unsigned char: 0xff is not EOF
    return static_cast<unsigned char>(*gptr());
}
Finally, xsgetn() is overridden. In a loop n is reduced until it is
0, at which point the function terminates. Alternatively, the member returns
if underflow() fails to obtain more characters. This member optimizes the
reading of series of characters: instead of calling streambuf::sbumpc()
n times, a block of avail characters is copied to the destination,
using streambuf::gbump() to consume avail characters from the buffer
using one function call:
std::streamsize xsgetn(char *dest, std::streamsize n)
{
int nread = 0;
while (n)
{
if (!in_avail())
{
if (underflow() == EOF)
break;
}
int avail = in_avail();
if (avail > n)
avail = n;
memcpy(dest + nread, gptr(), avail);
gbump(avail);
nread += avail;
n -= avail;
}
return nread;
}
Apart from streambuf, the header file unistd.h must have been read by the
compiler before the implementations of the member functions can be compiled.
The member function xsgetn() is called by streambuf::sgetn(),
which is a streambuf member. The following example illustrates the use of
this member function with an ifdnstreambuf object:
#include "ifdnbuf.h"
#include <iostream>
#include <istream>
using namespace std;
int main()
{
ifdnstreambuf fds(0, 30); // internally: 30 char buffer
char buf[80]; // main() reads blocks of 80
// chars
while (true)
{
unsigned n = fds.sgetn(buf, 80);
if (n == 0)
break;
cout.write(buf, n);
}
}
20.1.2.3: Seeking positions in `streambuf' objects
When devices support seek operations, classes derived from
streambuf should override streambuf::seekoff() and
streambuf::seekpos(). The class ifdseek, developed in this section, can
be used to read information from devices supporting such seek operations. The
class ifdseek was derived from ifdstreambuf, so it uses a character
buffer of just one character. The facilities to perform seek operations, which
are added to our new class ifdseek will make sure that the input buffer is
reset when a seek operation is requested. The class could also be derived from
the class ifdnstreambuf, in which case the arguments to reset the input
buffer must be adapted in such a way that its second and third parameters
point beyond the available input buffer. Let's have a look at the
characteristics of ifdseek:
The class ifdseek is derived from ifdstreambuf. Like the
latter class, ifdseek's member functions use facilities declared in
unistd.h. So, the compiler must have seen unistd.h before it can
compile the class' member functions. The class interface itself starts with:
class ifdseek: public ifdstreambuf
To reduce the amount of typing when referring to types defined by
std::streambuf and std::ios, several typedefs are defined at
the class' very top:
typedef std::streambuf::pos_type pos_type;
typedef std::streambuf::off_type off_type;
typedef std::ios::seekdir seekdir;
typedef std::ios::openmode openmode;
These typedefs refer to types that are defined in the header file
ios, which must therefore be included as well before the compiler reads
ifdseek's class definition.
ifdseek(int fd)
:
ifdstreambuf(fd)
{}
The member seekoff() is responsible for performing the actual
seek operations. It calls lseek() to seek a new position in a device whose
file descriptor is known. If seeking succeeds, setg() is called to define
an already empty buffer, so that the base class' underflow() member
will refill the buffer at the next input request.
pos_type seekoff(off_type offset, seekdir dir, openmode)
{
pos_type pos =
lseek
(
d_fd, offset,
(dir == std::ios::beg) ? SEEK_SET :
(dir == std::ios::cur) ? SEEK_CUR :
SEEK_END
);
if (pos < 0)
return -1;
setg(d_buffer, d_buffer + 1, d_buffer + 1);
return pos;
}
Finally, the member seekpos() is overridden as well:
it is actually defined as a call to seekoff():
pos_type seekpos(pos_type offset, openmode mode)
{
return seekoff(offset, std::ios::beg, mode);
}
An example of a program using the class ifdseek is the following. If
this program is given its own source file using input redirection, then
seeking is supported, and with the exception of the first line, every other
line is shown twice:
#include "fdinseek.h"
#include <string>
#include <iostream>
#include <istream>
#include <iomanip>
using namespace std;
int main()
{
ifdseek fds(0);
istream is(&fds);
string s;
while (true)
{
if (!getline(is, s))
break;
streampos pos = is.tellg();
cout << setw(5) << pos << ": `" << s << "'\n";
if (!getline(is, s))
break;
streampos pos2 = is.tellg();
cout << setw(5) << pos2 << ": `" << s << "'\n";
if (!is.seekg(pos))
{
cout << "Seek failed\n";
break;
}
}
}
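Whether seeking is supported depends on the device behind the file descriptor: a redirected regular file can be repositioned, whereas a pipe or terminal cannot. A program could probe for this before relying on seekg(). The following is a small sketch; the function isSeekable() is a hypothetical helper and not part of the class ifdseek.

```cpp
#include <unistd.h>     // lseek(), off_t, SEEK_CUR

// Hypothetical helper: a device supports seek operations if lseek() can
// report the current offset. For pipes, sockets and invalid descriptors
// lseek() returns -1.
bool isSeekable(int fd)
{
    return lseek(fd, 0, SEEK_CUR) != static_cast<off_t>(-1);
}
```

When isSeekable(0) returns false (e.g., when the program's input comes from a pipe), the seekg() call in the above program fails and `Seek failed' is shown.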
20.1.2.4: Multiple `unget()' calls in `streambuf' objects
As mentioned before, streambuf classes and classes derived from
streambuf should support at least ungetting the last read
character. Special care must be taken when series of unget() calls
must be supported. In this section the construction of a class supporting a
configurable number of istream::unget() or istream::putback() calls is
discussed.
Support for multiple (say `n') unget() calls is realized by
reserving an initial section of the input buffer, which is gradually filled up
to contain the last n characters read. The class was implemented as
follows:
The class is derived from std::streambuf. It
defines several data members, allowing the class to perform the bookkeeping
required to maintain an unget-buffer of a configurable size:
class fdunget: public std::streambuf
{
int d_fd;
unsigned d_bufsize;
unsigned d_reserved;
char* d_buffer;
char* d_base;
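The constructor expects a file descriptor, a buffer size and the number of
characters that can be ungot. This number determines the size of a reserved
area: the first d_reserved bytes of the class' input buffer.
The input buffer will always be at least one byte larger than
d_reserved. So, a certain number of bytes may be read. Then, once
reserved bytes have been read at least reserved bytes can be ungot.
The starting point for reading operations is
d_base, located reserved bytes into d_buffer. This will
always be the point where the buffer refills start.
Once the buffer has been allocated, the constructor defines
streambuf's buffer pointers using setg(). As no characters have been
read yet, all pointers are set to point to d_base. If unget() is
called at this point, no characters are available, so unget() will
(correctly) fail.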
fdunget (int fd, unsigned bufsz, unsigned unget)
:
d_fd(fd),
d_reserved(unget)
{
unsigned allocate =
bufsz > d_reserved ?
bufsz
:
d_reserved + 1;
d_buffer = new char [allocate];
d_base = d_buffer + d_reserved;
setg(d_base, d_base, d_base);
d_bufsize = allocate - d_reserved;
}
~fdunget()
{
delete [] d_buffer;
}
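Finally, underflow() is overridden.
First, the number of characters to move into the reserved area is
computed. This number is at most d_reserved, but it is equal to the actual
number of characters that can be ungot if this value is smaller.
These characters are then moved from the end of the input buffer to the
area immediately before d_base.
Then the buffer is refilled. Note that reading starts from
d_base and not from d_buffer.
Finally, streambuf's read buffer pointers are set up.
eback() is set to move locations before d_base, thus
defining the guaranteed unget-area,
gptr() is set to d_base, since that's the location of the
first read character after a refill, and
egptr() is set just beyond the location of the last character
read into the buffer.
Here is underflow()'s implementation: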
int underflow()
{
    if (gptr() < egptr())
        return static_cast<unsigned char>(*gptr());

    unsigned ungetsize = gptr() - eback();
    unsigned move = std::min(ungetsize, d_reserved);

                            // source and destination may overlap:
                            // memmove, not memcpy
    memmove(d_base - move, egptr() - move, move);

    int nread = read(d_fd, d_base, d_bufsize);
    if (nread <= 0)         // none read -> return EOF
        return EOF;

    setg(d_base - move, d_base, d_base + nread);
    return static_cast<unsigned char>(*gptr());
}
};
The following program illustrates the class fdunget. It reads at most
10 characters from the standard input, stopping at EOF. A guaranteed
unget-buffer of 2 characters is defined, in a buffer of 3 characters. Just
before reading a character the program tries to unget at most 6 characters.
This is of course impossible, but the program will nicely unget as many
characters as possible, considering the actual number of characters read:
#include "fdunget.h"
#include <string>
#include <iostream>
#include <istream>
using namespace std;
int main()
{
fdunget fds(0, 3, 2);
istream is(&fds);
char c;
for (int idx = 0; idx < 10; ++idx)
{
cout << "after reading " << idx << " characters:\n";
for (int ug = 0; ug <= 6; ++ug)
{
if (!is.unget())
{
cout
<< "\tunget failed at attempt " << (ug + 1) << "\n"
<< "\trereading: '";
is.clear();
while (ug--)
{
is.get(c);
cout << c;
}
cout << "'\n";
break;
}
}
if (!is.get(c))
{
cout << " reached\n";
break;
}
cout << "Next character: " << c << endl;
}
}
/*
Generated output after 'echo abcde | program':
after reading 0 characters:
unget failed at attempt 1
rereading: ''
Next character: a
after reading 1 characters:
unget failed at attempt 2
rereading: 'a'
Next character: b
after reading 2 characters:
unget failed at attempt 3
rereading: 'ab'
Next character: c
after reading 3 characters:
unget failed at attempt 4
rereading: 'abc'
Next character: d
after reading 4 characters:
unget failed at attempt 4
rereading: 'bcd'
Next character: e
after reading 5 characters:
unget failed at attempt 4
rereading: 'cde'
Next character:
after reading 6 characters:
unget failed at attempt 4
rereading: 'de
'
reached
*/
When extracting information from istream objects, operator>>(), the
standard extraction operator, is perfectly suited for the task as in most
cases the extracted fields are white-space or otherwise clearly separated from
each other. But this does not hold true in all situations. For example, when a
web-form is posted to some processing script or program, the receiving program
may receive the form field's values as urlencoded characters: letters
and digits are sent unaltered, blanks are sent as + characters, and all
other characters start with % followed by the character's
ascii-value represented by its two digit hexadecimal value.
When decoding urlencoded information, a simple hexadecimal extraction
won't work, since that will extract as many hexadecimal characters as
available, instead of just two. Since the letters a-f and the digits 0-9 are
legal hexadecimal characters, a text like My name is `Ed', urlencoded as
My+name+is+%60Ed%27
will result in the extraction of the hexadecimal values 60ed and
27, instead of 60 and 27. The name Ed will disappear from
view, which is clearly not what we want.
In this case, having seen the % we could extract 2 characters, put
them in an
istringstream object, and extract the hexadecimal value from
the istringstream object. A bit cumbersome, but doable. Other approaches,
however, are possible as well.
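The `cumbersome but doable' approach can be sketched as follows. The function urlDecode() is a hypothetical helper, written only to illustrate the idea of passing exactly two characters to an istringstream. It is not the class developed in the remainder of this section.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: decodes '+' characters and %XX sequences.
std::string urlDecode(std::string const &encoded)
{
    std::string decoded;

    for (std::string::size_type idx = 0; idx < encoded.size(); ++idx)
    {
        char ch = encoded[idx];

        if (ch == '+')
            decoded += ' ';
        else if (ch == '%' && idx + 2 < encoded.size())
        {
            // hand exactly two characters to the istringstream, so that
            // the hexadecimal extraction cannot run into following text
            std::istringstream hexField(encoded.substr(idx + 1, 2));
            unsigned value;
            if (hexField >> std::hex >> value)
                decoded += static_cast<char>(value);
            idx += 2;               // skip the two hexadecimal digits
        }
        else
            decoded += ch;
    }
    return decoded;
}
```

Given My+name+is+%60Ed%27 the function returns My name is `Ed': the letters E and d are no longer absorbed by the hexadecimal extraction.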
The following class fistream (for fixed-sized field istream) defines
an istream class supporting both fixed-sized field extractions and
blank-delimited extractions (as well as unformatted read() calls). The
class may be initialized as a wrapper around an existing istream, or
it can be initialized using the name of an existing file. The class is derived
from istream, allowing all extractions and operations supported by
istreams in general. The class will need the following data members:
d_filebuf: a filebuffer used when fistream reads its information
from a named (existing) file. Since the filebuffer is only needed in
that case, and since it must be allocated dynamically, it is defined
as an auto_ptr<filebuf> object.
d_streambuf: a pointer to fistream's streambuf. It will point
to d_filebuf's filebuf when fistream opens a file by name. When an
existing istream is used to construct an fistream it will
point to the existing istream's streambuf.
d_iss: an istringstream object which is used for the fixed field
extractions.
d_width: an unsigned indicating the width of the field to
extract. If 0 no fixed field extractions will be used, but
information will be extracted from the istream base class object
using standard extractions.
Here is the initial section of fistream's class interface:
class fistream: public std::istream
{
std::auto_ptr<std::filebuf> d_filebuf;
std::streambuf *d_streambuf;
std::istringstream d_iss;
unsigned d_width;
As mentioned, fistream objects can be constructed from either a
filename or an existing istream object. Thus, the class interface shows
two constructors:
fistream(std::istream &stream);
fistream(char const *name,
std::ios::openmode mode = std::ios::in);
When an fistream object is constructed using an existing istream
object, the fistream's istream part is simply using the stream's
streambuf object:
fistream::fistream(istream &stream)
:
istream(stream.rdbuf()),
d_streambuf(rdbuf()),
d_width(0)
{}
When an fistream object is constructed using a filename, the
istream base initializer is given a new filebuf object to be used as
its streambuf. Since the class' data members are not initialized before
the class' base class has been constructed, d_filebuf can only be
initialized thereafter. By then the filebuf is only available as
rdbuf(), which returns a streambuf *. However, as it is actually a
filebuf, a static_cast is used to cast the streambuf pointer
returned by rdbuf() to a filebuf *, so d_filebuf can be
initialized:
fistream::fistream(char const *name, ios::openmode mode)
:
    istream(new filebuf()),
    d_filebuf(static_cast<filebuf *>(rdbuf())),
    d_streambuf(d_filebuf.get()),
    d_width(0)
{
    d_filebuf->open(name, mode);
}
There is only one additional public member: setField(field const
&). This member is used to define the size of the next field to extract. Its
parameter is a reference to a field class, a
manipulator class
defining the width of the next field.
Since a field & is mentioned in fistream's interface, field
must be declared before fistream's interface starts. The class field
itself is simple: it declares fistream as its friend, and it has two data
members: d_width specifies the width of the next field, d_newWidth is
set to true if d_width's value should actually be used. If
d_newWidth is false, fistream will return to its standard extraction
mode. The class field furthermore has two constructors: a default
constructor, setting d_newWidth to false and a second constructor
expecting the width of the next field to extract as its value. Here is the
class field:
class field
{
friend class fistream;
unsigned d_width;
bool d_newWidth;
public:
field(unsigned width)
:
d_width(width),
d_newWidth(true)
{}
field()
:
d_newWidth(false)
{}
};
Since field declares fistream as its friend, setField may
inspect field's members directly.
Time to return to setField(). This function expects a reference to a
field object, initialized in one of three different ways:
field(): When setField()'s argument is a field object
constructed by its default constructor the next extraction will use
the same fieldwidth as the previous extraction.
field(0): When this field object is used as setField()'s
argument, fixed-sized field extraction stops, and the fistream
will act like any standard istream object.
field(x): When the field object itself is initialized by a
non-zero unsigned value x, then the next field width will be x
characters wide. The preparation of such a field is left to
setBuffer(), fistream's only private member.
Here is setField()'s implementation:
std::istream &fistream::setField(field const &params)
{
if (params.d_newWidth) // new field size requested
d_width = params.d_width; // set new width
if (!d_width) // no width?
rdbuf(d_streambuf); // return to the old buffer
else
setBuffer(); // define the extraction buffer
return *this;
}
The private member setBuffer() defines a buffer of d_width + 1
characters, and uses read() to fill the buffer with d_width
characters. The buffer is terminated by an ASCII-Z character. This buffer
is then used to initialize the d_iss member. Finally, fistream's
rdbuf() member is used to extract d_iss's data via the
fistream object itself:
void fistream::setBuffer()
{
char *buffer = new char[d_width + 1];
rdbuf(d_streambuf); // use istream's buffer to
buffer[read(buffer, d_width).gcount()] = 0; // read d_width chars,
// terminated by ascii-Z
d_iss.str(buffer);
delete [] buffer;                   // array form: allocated by new []
rdbuf(d_iss.rdbuf()); // switch buffers
}
Although setField() could be used to configure fistream to use or
not to use fixed-sized field extraction, using manipulators is probably
preferable. To allow field objects to be used as manipulators an
overloaded extraction operator was defined, accepting an istream & and a
field const & object. Using this extraction operator, statements like
fis >> field(2) >> x >> field(0);
are possible (assuming fis is a fistream object). Here is the
overloaded operator>>(), as well as its declaration:
istream &std::operator>>(istream &str, field const &params)
{
    return static_cast<fistream *>(&str)->setField(params);
}
Declaration:
namespace std
{
istream &operator>>(istream &str, FBB::field const &params);
}
Finally, an example. The following program uses a fistream object to
url-decode url-encoded information appearing at its standard input:
int main()
{
fistream fis(cin);
fis >> hex;
while (true)
{
int x;                  // int: fis.get() may return EOF (-1)
switch (x = fis.get())
{
case '\n':
cout << endl;
break;
case '+':
cout << ' ';
break;
case '%':
fis >> field(2) >> x >> field(0);
// FALLING THROUGH
default:
cout << static_cast<char>(x);
break;
case EOF:
return 0;
}
}
}
/*
Generated output after:
echo My+name+is+%60Ed%27 | a.out
My name is `Ed'
*/
The fork() system call is well known. When a program needs to start a new
process, system() can be used, but this requires the program to wait for the
child process to terminate. The more general way to spawn subprocesses is to
call fork().
In this section we will see how C++ can be used to wrap classes around a
complex system call like fork(). Much of what follows in this section
directly applies to the Unix operating system, and the discussion will
therefore focus on that operating system. However, other systems usually
provide comparable facilities. The following discussion is based heavily on
the notion of design patterns, as published by Gamma et al. (1995).
When fork() is called, the current program is duplicated in memory,
thus creating a new process, and both processes continue their execution just
below the fork() system call. The two processes may, however, inspect the
return value of fork(): the return value in the original process (called
the parent process) differs from the return value in the newly created
process (called the child process):
In the parent process fork() returns the process ID of
the child process created by the fork() system call. This is a positive
integer value.
In the child process fork() returns 0.
If fork() fails, -1 is returned.
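The three return values can be told apart as in the following sketch. It calls the system's fork() directly and is not the Fork class developed below; the function forkAndWait() is a hypothetical helper in which the child merely exits, while the parent waits for it.

```cpp
#include <unistd.h>     // fork(), _exit()
#include <sys/types.h>  // pid_t
#include <sys/wait.h>   // waitpid(), WIFEXITED, WEXITSTATUS

// Hypothetical helper: the child process exits with `code', the parent
// waits for the child and returns its exit status (-1 on any failure).
int forkAndWait(int code)
{
    pid_t pid = fork();

    if (pid < 0)                // fork() failed: no child was created
        return -1;

    if (pid == 0)               // 0: we're in the child process
        _exit(code);

                                // positive: we're in the parent process,
    int status;                 // pid holds the child's process ID
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;

    return WEXITSTATUS(status);
}
```

The child uses _exit() rather than exit() so that it does not flush stdio buffers inherited from its parent a second time.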
A basic Fork class should hide all bookkeeping details of a system
call like fork() from its users. The class Fork developed here will do
just that. The class itself only needs to take care of the proper execution of
the fork() system call. Normally, fork() is called to start a child
process, usually boiling down to the execution of a separate process. This
child process may expect input at its standard input stream and/or may
generate output to its standard output and/or standard error streams. Fork
does not know all this, and does not have to know what the child process will
do. However, Fork objects should be able to activate their child
processes.
Unfortunately, Fork's constructor cannot know what actions its child
process should perform. Similarly, it cannot know what actions the parent
process should perform. For this particular situation, the
template method design pattern was developed. According to Gamma et al.,
the template method design pattern
``Define(s) the skeleton of an algorithm in an operation, deferring some steps to subclasses. (The) Template Method (design pattern) lets subclasses redefine certain steps of an algorithm, without changing the algorithm's structure.''
This design pattern allows us to define an abstract base class
already implementing the essential steps related to the fork() system
call and deferring the implementation of certain normally used parts of the
fork() system call to subclasses.
The Fork abstract base class itself has the following characteristics:
It defines a data member d_pid. This data member will contain
the child's process id (in the parent process) and the value 0 in the
child process:
class Fork
{
int d_pid;
Its public interface declares but two members:
a fork() member function, performing the actual forking
(i.e., it will create the (new) child process);
a virtual destructor ~Fork(), which may be
overridden by derived classes.
Here is Fork's complete public interface:
virtual ~Fork()
{}
void fork();
All remaining member functions are declared in the class'
protected section and can thus only be used by derived classes. They
are:
The member function pid(), allowing derived classes to
access the system fork()'s return value:
int pid()
{
return d_pid;
}
int waitForChild(), which can be called by parent
processes to wait for the completion of their child processes (as discussed
below). This member is declared in the class interface. Its implementation is
#include "fork.ih"
int Fork::waitForChild()
{
int status;
waitpid(d_pid, &status, 0);
return WEXITSTATUS(status);
}
This simple implementation returns the child's
exit status to
the parent. The called system function
waitpid() blocks until the
child terminates.
When fork() system calls are used, parent processes and child processes
may always be distinguished. The main distinction between these processes is
that d_pid is equal to the child's process id in the parent process, while
d_pid is equal to 0 in the child process itself. Since these two processes
may always be distinguished, their actions must be implemented by classes
derived from Fork. To enforce this requirement, the members childProcess(),
defining the child process' actions, and parentProcess(), defining the parent
process' actions, were defined as pure virtual functions:
virtual void childProcess() = 0; // both must be implemented
virtual void parentProcess() = 0;
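childRedirections(): this member should be overridden if any standard stream
(cin, cout or cerr) must be redirected in the child process (cf. section
20.3.1);
parentRedirections(): this member should be overridden if any standard stream
(cin, cout or cerr) must be redirected in the parent process.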
virtual void childRedirections()
{}
virtual void parentRedirections()
{}
fork() calls the system function fork()
(Caution: since the system function fork() is called by a member
function having the same name, the :: scope resolution operator must be
used to prevent a recursive call of the member function itself). After calling
::fork(), depending on its return value, either parentProcess()
or childProcess() is called. Maybe redirection is
necessary. Fork::fork()'s implementation calls childRedirections()
just before calling childProcess(), and parentRedirections() just
before calling parentProcess():
#include "fork.ih"
void Fork::fork()
{
if ((d_pid = ::fork()) < 0)
throw "Fork::fork() failed";
if (d_pid == 0) // childprocess has pid == 0
{
childRedirections();
childProcess();
exit(1); // we shouldn't come here:
// childProcess() should exit
}
parentRedirections();
parentProcess();
}
In fork.cc the class'
internal header file fork.ih is
included. This header file takes care of the inclusion of the necessary system
header files, as well as the inclusion of fork.h itself. Its
implementation is:
#include "fork.h"
#include <cstdlib>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
Child processes should not return: once they have completed their tasks
they should terminate. This happens automatically when the child process
performs a call to a member of the exec...() family, but if the child
itself remains active, then it must make sure that it terminates properly. A
child process normally uses exit() to terminate itself, but it should be
realized that exit() prevents the activation of destructors of objects
defined at the same or more superficial nesting levels than the level at
which exit() is called. Destructors of globally defined objects are
activated when exit() is used. When exit() is used to terminate
childProcess(), that member should therefore either call a support member
function defining all the objects it needs, or it should define all its
objects in a compound statement (e.g., a try block), calling exit() beyond
the compound statement.
Parent processes should normally wait for their children to complete. Terminating child processes inform their parents that they are about to terminate by sending out a signal, which should be caught by their parents. If child processes terminate and their parent processes do not catch those signals, then such child processes remain visible as so-called zombie processes.
If parent processes must wait for their children to complete, they may
call the member waitForChild(). This member returns the exit status of a
child process to its parent.
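What waiting for a child boils down to can be sketched with plain system calls. The helper childExitStatus() below is hypothetical and not part of the class Fork. Unlike waitForChild() it inspects WIFEXITED before using WEXITSTATUS, since the latter macro is only meaningful for a child that terminated normally. The child uses _exit() so that stream buffers inherited from the parent are not flushed a second time:

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: forks a child terminating with exit value `code',
// and returns the exit status collected by the parent (-1 on failure).
int childExitStatus(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                      // fork() failed
    if (pid == 0)
        _exit(code);                    // child: terminates at once
    int status;
    if (waitpid(pid, &status, 0) != pid)    // parent: blocks until the
        return -1;                          // child has terminated
    // WEXITSTATUS is only defined if the child terminated normally
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

void demoWait()
{
    assert(childExitStatus(3) == 3);
}
```

Because the parent has called waitpid(), the terminated child does not remain behind as a zombie.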
There exists a situation where the child process continues to
live, but the parent dies. In nature this happens all the time: parents
tend to die before their children do. In our context (i.e. C++), this is
called a
daemon program: the parent process dies and the child program
continues to run as a child of the basic
init process. Again, when the
child eventually dies a signal is sent to its `step-parent' init. No
zombie is created here, as init catches the termination signals of all its
(step-) children. The construction of a daemon process is very
simple, given the availability of the class Fork (cf. section
20.3.2).
ios::rdbuf() member
function. By assigning the streambuf of a stream to another stream, both
stream objects access the same streambuf, thus realizing redirection at
the level of the programming language itself.
It should be realized that this is fine within the context of the C++
program, but if that context is left, the redirection terminates, as the
operating system does not know about streambuf objects. This happens,
e.g., when a program uses a
system() call to start a subprogram. The
program at the end of this section uses C++ redirection to redirect the
information inserted into
cout to a file, and then calls
system("echo hello world")
to echo a well-known line of text. Since echo writes its information
to the standard output, the text would end up in the program's redirected file if
C++'s redirection were recognized by the operating system.
Actually, this doesn't happen, and hello world still appears at the
program's standard output instead of in the redirected file. A solution to this
problem involves redirection at the operating system level, for which some
operating systems (e.g.,
Unix and friends) provide system calls like
dup() and
dup2(). Examples of these system calls are given in section
20.3.3.
Here is the example showing that redirection using streambuf objects within
C++ is not seen at the operating system level:
#include <iostream>
#include <fstream>
#include <cstdlib>
using namespace::std;
int main()
{
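ofstream of("outfile");
streambuf *original = cout.rdbuf(of.rdbuf()); // keep cout's own buffer
cout << "To the of stream" << endl;
system("echo hello world");
cout << "To the of stream" << endl;
cout.rdbuf(original); // restore it before `of' is destroyed
}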
/*
Generated output: on the file `outfile'
To the of stream
To the of stream
On standard output:
hello world
*/
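For comparison, here is a minimal sketch of redirection at the operating system level. It is an illustration only: the helper names systemInto() and firstLine() and the file name /tmp/redirect_demo.out are assumptions made for this sketch. File descriptor 1 itself is made to refer to the file using dup2(), so a subprogram started by system() inherits the redirection. Afterwards the original standard output is restored from a copy obtained through dup():

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <fcntl.h>
#include <fstream>
#include <iostream>
#include <string>
#include <unistd.h>

// Hypothetical helper: runs `command' via system() while file descriptor 1
// refers to the file `path'. Returns true if the command reported success.
bool systemInto(char const *path, char const *command)
{
    std::cout.flush();                      // empty the buffers before the
    std::fflush(stdout);                    // descriptor is switched
    int file = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    int saved = dup(STDOUT_FILENO);         // remember the original stdout
    bool ok = file >= 0 && saved >= 0 && dup2(file, STDOUT_FILENO) >= 0;
    if (ok)
    {
        ok = std::system(command) == 0;     // the subprogram inherits fd 1
        dup2(saved, STDOUT_FILENO);         // restore the original stdout
    }
    if (file >= 0)
        close(file);
    if (saved >= 0)
        close(saved);
    return ok;
}

// Returns the first line of the file `path' (empty if it can't be read).
std::string firstLine(char const *path)
{
    std::ifstream in(path);
    std::string line;
    std::getline(in, line);
    return line;
}

void demoRedirect()
{
    assert(systemInto("/tmp/redirect_demo.out", "echo hello world"));
    assert(firstLine("/tmp/redirect_demo.out") == "hello world");
    std::remove("/tmp/redirect_demo.out");
}
```

This time hello world ends up in the file, as the operating system itself performs the redirection.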
fork() is to start a
child process. The parent process terminates immediately after spawning the
child process. If this happens, the child process continues to run as a child
process of
init, the always running first process on
Unix systems. Such
a process is often called a
daemon, running as a
background process.
Although the following example can easily be constructed as a plain C
program, it was included in the C++ Annotations because it is so closely
related to the current discussion of the Fork class. I thought about
adding a daemon() member to that class, but eventually decided against it
because the construction of a daemon program is very simple and requires no
features other than those currently offered by the class Fork. Here is an
example illustrating the construction of a daemon program:
#include <iostream>
#include <cstdlib>
#include <unistd.h>
#include "fork.h"
class Daemon: public Fork
{
public:
virtual void parentProcess() // the parent does nothing.
{}
virtual void childProcess()
{
sleep(3); // actions taken by the child
// just a message...
std::cout << "Hello from the child process\n";
exit (0); // The child process exits.
}
};
int main()
{
Daemon daemon;
daemon.fork(); // program immediately returns
return 0;
}
/*
Generated output:
The next command prompt, then after 3 seconds:
Hello from the child process
*/
pipe() system call. When two processes want to communicate
using such file descriptors, the following takes place:
pipe() system call. One of the file descriptors is used for writing, the
other file descriptor is used for reading.
fork() function is called),
duplicating the file descriptors. Now we have four file descriptors as both
the child process and the parent process have their own copies of the two
file descriptors created by pipe().
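These steps can be sketched using the bare system calls. The function throughPipe() is a hypothetical helper, written for this illustration only: the child closes its copy of the reading end and writes a text into the pipe, the parent closes its copy of the writing end and reads until end-of-file is sensed. Note that read() only reports end-of-file once all copies of the writing end have been closed, which is why the parent must close its own copy first:

```cpp
#include <cassert>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: the child writes `text' into a pipe, the parent
// returns what it read from the pipe (an empty string on failure).
std::string throughPipe(std::string const &text)
{
    int fd[2];                      // fd[0]: reading end, fd[1]: writing end
    if (pipe(fd) != 0)
        return "";
    pid_t pid = fork();             // both processes now own both ends
    if (pid < 0)
    {
        close(fd[0]);
        close(fd[1]);
        return "";
    }
    if (pid == 0)                   // the child only writes
    {
        close(fd[0]);
        ssize_t n = write(fd[1], text.data(), text.size());
        _exit(n == static_cast<ssize_t>(text.size()) ? 0 : 1);
    }
    close(fd[1]);                   // the parent only reads
    std::string ret;
    char buffer[256];
    ssize_t n;
    while ((n = read(fd[0], buffer, sizeof(buffer))) > 0)
        ret.append(buffer, n);
    close(fd[0]);
    waitpid(pid, 0, 0);             // collect the child: no zombie remains
    return ret;
}

void demoPipe()
{
    assert(throughPipe("hello pipe") == "hello pipe");
}
```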
Pipe class
constructed here. Let's have a look at its characteristics (before the
implementations can be compiled, the compiler must have read the
class' header file as well as the file unistd.h):
pipe() system call expects a pointer to two int values,
which will represent, respectively, the file descriptors to use for accessing
the reading end and the writing end of the constructed pipe, after
pipe()'s successful completion. To avoid confusion, an enum is defined
associating these ends with symbolic constants. Furthermore, the class stores
the two file descriptors in a data member d_fd. Here is the class header
and its private data:
class Pipe
{
enum RW { READ, WRITE };
int d_fd[2];
pipe() to create a set of associated file descriptors used for
accessing both ends of a pipe:
Pipe::Pipe()
{
if (pipe(d_fd))
throw "Pipe::Pipe(): pipe() failed";
}
readOnly() and readFrom() are used to configure
the pipe's reading end. The latter function is used to set up redirection, by
providing an alternate file descriptor which can be used to read from the
pipe. Usually this alternate file descriptor is
STDIN_FILENO, allowing
cin to extract information from the pipe. The former function is merely
used to configure the reading end of the pipe: it closes the matching writing
end, and returns a file descriptor that can be used to read from the pipe:
int Pipe::readOnly()
{
close(d_fd[WRITE]);
return d_fd[READ];
}
void Pipe::readFrom(int fd)
{
readOnly();
redirect(d_fd[READ], fd);
close(d_fd[READ]);
}
writeOnly() and two writtenBy() members are available to
configure the writing end of a pipe. The former function is merely used to configure the writing end of the
pipe: it closes the matching reading end, and returns a file descriptor that
can be used to write to the pipe:
int Pipe::writeOnly()
{
close(d_fd[READ]);
return d_fd[WRITE];
}
void Pipe::writtenBy(int fd)
{
writtenBy(&fd, 1);
}
void Pipe::writtenBy(int const *fd, unsigned n)
{
writeOnly();
for (unsigned idx = 0; idx < n; idx++)
redirect(d_fd[WRITE], fd[idx]);
close(d_fd[WRITE]);
}
For the latter member two overloaded versions are available:
writtenBy(int fileDescriptor) is used to configure single
redirection, so that a specific file descriptor (usually
STDOUT_FILENO
or
STDERR_FILENO) may be used to write to the pipe;
writtenBy(int const *fileDescriptor, unsigned n = 2) may be used
to configure multiple redirection, providing an array argument containing
file descriptors. Information written to any of these file descriptors is
actually written into the pipe.
redirect(), which is used
to define a redirection using the
dup2() system call. This function
expects two file descriptors. The first file descriptor represents a file
descriptor which can be used to access the device's information, the second
file descriptor is an alternate file descriptor which may also
be used to access the device's information once dup2() has completed
successfully. Here is redirect()'s implementation:
void Pipe::redirect(int fd, int alternateFd)
{
if (dup2(fd, alternateFd) < 0)
throw "Pipe: redirection failed";
}
Pipe
objects, we'll now use Fork and Pipe in several demonstration
programs.
ParentSlurp, derived from Fork, starts a child process
which execs a program (like /bin/ls). The (standard) output of the
execed program is then read by the parent process. The parent process will
(for demonstration purposes) write the lines it receives to its standard
output stream, while prepending line numbers to the received lines. It is most
convenient here to redirect the parent's standard input stream, so that the
parent can read the output from the child process from its std::cin
input stream. Therefore, the only pipe that's used is used as an input
pipe at the parent, and an output pipe at the child.
The class ParentSlurp has the following characteristics:
Fork. Before starting ParentSlurp's class
interface, the compiler must have read both fork.h and
pipe.h. Furthermore, the class only uses one data member: a Pipe
object d_pipe:
class ParentSlurp: public Fork
{
Pipe d_pipe;
Pipe's constructor automatically constructs a pipe, and
since d_pipe is automatically constructed by ParentSlurp's default
constructor, there is no need to define ParentSlurp's constructor
explicitly. As no constructor needs to be implemented, all ParentSlurp's
members can be declared as protected members.
childRedirections() member configures the child's end of the pipe for
writing. So, all information written to the child's standard output stream
will end up in the pipe. The big advantage of this all is that no streams
around file descriptors are needed to write to a file descriptor:
virtual void childRedirections()
{
d_pipe.writtenBy(STDOUT_FILENO);
}
parentRedirections() member configures its end of the pipe
as a reading pipe. It does so by redirecting the reading end of the pipe to
its standard input file descriptor (STDIN_FILENO), thus allowing
extractions from cin instead of using streams built around file
descriptors.
virtual void parentRedirections()
{
d_pipe.readFrom(STDIN_FILENO);
}
childProcess() member only has to concentrate on its own
actions. As it only needs to execute a program (writing information to its
standard output), the member consists of but one statement:
virtual void childProcess()
{
// the child is replaced by /bin/ls; a null pointer ends the arguments
execl("/bin/ls", "/bin/ls", static_cast<char *>(0));
}
parentProcess() member simply `slurps' the information
appearing at its standard input. Doing so, it actually reads the child's
output. It copies the received lines to its standard output stream after
having prefixed line numbers to them:
void ParentSlurp::parentProcess()
{
std::string line;
unsigned int nr = 1;
while (getline(std::cin, line))
std::cout << nr++ << ": " << line << std::endl;
waitForChild();
}
ParentSlurp object, and
calls its fork() member. Its output consists of a numbered list of files
in the directory where the program is started. Note that the program also
needs the fork.o, pipe.o and waitforchild.o object files (see
earlier sources):
int main()
{
ParentSlurp ps;
ps.fork();
return 0;
}
/*
Generated Output (example only, actually obtained output may differ):
1: a.out
2: bitand.h
3: bitfunctional
4: bitnot.h
5: daemon.cc
6: fdinseek.cc
7: fdinseek.h
...
*/
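The mechanism used by ParentSlurp can also be sketched in one function, without the classes Fork and Pipe. The following is an illustration under several assumptions: slurp() is a hypothetical function, execlp() is used (so the program is searched for in the PATH), and the parent reads directly from the pipe's reading end using read() rather than redirecting its standard input:

```cpp
#include <cassert>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <vector>

// Hypothetical helper: runs `program arg' as a child process and returns
// the lines the child wrote to its standard output.
std::vector<std::string> slurp(char const *program, char const *arg)
{
    std::vector<std::string> lines;
    int fd[2];
    if (pipe(fd) != 0)
        return lines;
    pid_t pid = fork();
    if (pid < 0)
    {
        close(fd[0]);
        close(fd[1]);
        return lines;
    }
    if (pid == 0)                               // the child:
    {
        close(fd[0]);
        if (dup2(fd[1], STDOUT_FILENO) < 0)     // stdout now is the pipe
            _exit(127);
        close(fd[1]);
        execlp(program, program, arg, static_cast<char *>(0));
        _exit(127);                     // only reached if execlp() failed
    }
    close(fd[1]);                               // the parent:
    std::string line;
    char ch;
    while (read(fd[0], &ch, 1) == 1)    // simple, unbuffered reading
    {
        if (ch != '\n')
            line += ch;
        else
        {
            lines.push_back(line);
            line.clear();
        }
    }
    if (!line.empty())                  // a final line lacking '\n'
        lines.push_back(line);
    close(fd[0]);
    waitpid(pid, 0, 0);
    return lines;
}

void demoSlurp()
{
    std::vector<std::string> out = slurp("echo", "hello world");
    assert(out.size() == 1 && out[0] == "hello world");
}
```

Reading one character at a time keeps the sketch short. A real program would rather use a stream built around the file descriptor, or redirect cin as ParentSlurp does.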
start will start a new child process. The parent will return the ID
(a number) to the user. The ID may thereupon be used to send a message to that
particular child process
<nr> text will send ``text'' to the child process having ID
<nr>;
stop <nr> will terminate the child process having ID <nr>;
exit will terminate the parent as well as all of its children.
A problem with programs like our monitor is that these programs allow
asynchronous input from multiple sources: input may appear at the
standard input as well as at the input-sides of pipes. Also, multiple output
channels are used. To handle situations like these, the
select() system
call was developed.
20.3.5.1: The class `Selector'
The
select() system call was developed to handle asynchronous
I/O multiplexing.
This system call can be used to handle, e.g., input appearing
simultaneously at a set of file descriptors.
The select() system function is rather complex, and its full
discussion is beyond the C++ Annotations' scope.
However, its use may be simplified by providing a class Selector,
hiding its details and offering an easy-to-use public interface. Here its
characteristics are discussed:
Selector's members are very small, allowing us to define
most of its members as inline functions. The class requires quite a few data
members, most of them of types that were specifically constructed for use by
select(). Therefore, before the class interface can be handled by the
compiler, various header files must have been read by it:
#include <limits.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/types.h>
fd_set is a type designed to be used by select() and variables of
this type contain the set of file descriptors on which select() has sensed
some activity. Furthermore, select() allows us to fire an
asynchronous alarm. To specify alarm times, the class receives a
timeval data member. The remaining members are used by the class for
internal bookkeeping purposes, illustrated below. Here are the class' header
and data members:
class Selector
{
fd_set d_read;
fd_set d_write;
fd_set d_except;
fd_set d_ret_read;
fd_set d_ret_write;
fd_set d_ret_except;
timeval d_alarm;
int d_max;
int d_ret;
int d_readidx;
int d_writeidx;
int d_exceptidx;
Selector(): the (default) constructor. It
clears the read, write and exception fd_set variables, and switches off the
alarm. Except for d_max the remaining data members do not require
initialization. Here is the implementation of Selector's constructor:
Selector::Selector()
{
FD_ZERO(&d_read);
FD_ZERO(&d_write);
FD_ZERO(&d_except);
noAlarm();
d_max = 0;
}
int wait(): this member function will block
until activity is sensed at any of the file descriptors monitored by
the Selector object, or until the alarm times out. It will throw an
exception when the select() system call itself fails. Here is wait()'s
implementation:
int Selector::wait()
{
timeval t = d_alarm;
d_ret_read = d_read;
d_ret_write = d_write;
d_ret_except = d_except;
d_readidx = 0;
d_writeidx = 0;
d_exceptidx = 0;
d_ret = select(d_max, &d_ret_read, &d_ret_write, &d_ret_except, &t);
if (d_ret < 0)
throw "Selector::wait()/select() failed";
return d_ret;
}
int nReady(): this member function's return value
is defined only when wait() has returned. In that case it returns
0 for an alarm timeout, -1 if select() failed, and the number of file
descriptors on which activity was sensed otherwise. It can be implemented
inline:
int nReady()
{
return d_ret;
}
int readFd(): this member function's return
value also is defined only after wait() has returned. Its return value is
-1 if no (more) input file descriptors are available. Otherwise the next file
descriptor available for reading is returned. Its inline implementation is:
int readFd()
{
return checkSet(&d_readidx, d_ret_read);
}
int writeFd(): operating analogously to
readFd(), it returns the next file descriptor that is ready for writing.
Using d_writeidx and d_ret_write, it is implemented analogously to
readFd();
int exceptFd(): operating analogously to
readFd(), it returns the next exception file descriptor on which activity
was sensed. Using d_exceptidx and d_ret_except, it is implemented
analogously to readFd();
void setAlarm(int sec, int usec = 0): this
member activates Selector's alarm facility. At least the number of seconds to
wait for the alarm to go off must be specified. It simply assigns values to
d_alarm's fields. Then, at the next Selector::wait() call, the alarm
will fire (i.e., wait() returns with return value 0) once the configured
alarm-interval has passed. Here is its (inline) implementation:
void setAlarm(int sec, int usec = 0)
{
d_alarm.tv_sec = sec;
d_alarm.tv_usec = usec;
}
void noAlarm(): this member switches off the
alarm, by simply setting the alarm interval to a very long period. Implemented
inline as:
void noAlarm()
{
setAlarm(INT_MAX, 0); // tv_usec must stay below 1000000
}
void addReadFd(int fd): this member adds a
file descriptor to the set of input file descriptors monitored by the
Selector object. The member function wait() will return once input is
available at the indicated file descriptor. Here is its inline implementation:
void addReadFd(int fd)
{
addFd(&d_read, fd);
}
void addWriteFd(int fd): this member adds a
file descriptor to the set of output file descriptors monitored by the
Selector object. The member function wait() will return once the
indicated file descriptor is ready for writing. Using d_write, it is
implemented analogously to addReadFd();
void addExceptFd(int fd): this member adds
a file descriptor to the set of exception file descriptors to be monitored by
the Selector object. The member function wait() will return once
activity is sensed at the indicated file descriptor. Using d_except, it
is implemented analogously to addReadFd();
void rmReadFd(int fd): this member removes a
file descriptor from the set of input file descriptors monitored by the
Selector object. Here is its inline implementation:
void rmReadFd(int fd)
{
FD_CLR(fd, &d_read);
}
void rmWriteFd(int fd): this member removes a
file descriptor from the set of output file descriptors monitored by the
Selector object. Using d_write, it is implemented analogously to
rmReadFd();
void rmExceptFd(int fd): this member removes
a file descriptor from the set of exception file descriptors to be monitored
by the Selector object. Using d_except, it is implemented analogously
to rmReadFd();
private section:
addFd() adds a certain file descriptor to a certain
fd_set. Here is its implementation:
void Selector::addFd(fd_set *set, int fd)
{
FD_SET(fd, set);
if (fd >= d_max)
d_max = fd + 1;
}
checkSet() tests whether a certain file descriptor
(*index) is found in a certain fd_set. Here is its implementation:
int Selector::checkSet(int *index, fd_set &set)
{
int &idx = *index;
while (idx < d_max && !FD_ISSET(idx, &set))
++idx;
return idx == d_max ? -1 : idx++;
}
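What Selector wraps can be sketched in a single function calling select() directly. The function readable() below is hypothetical and only covers the set of input file descriptors: it builds the fd_set, computes the value that Selector keeps in d_max, waits for at most a given number of milliseconds, and then collects the descriptors on which input was sensed:

```cpp
#include <cassert>
#include <cstddef>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>
#include <vector>

// Hypothetical helper: returns the descriptors in `fds' from which can be
// read, waiting at most `msec' milliseconds. An empty vector is returned
// at a timeout or if select() failed.
std::vector<int> readable(std::vector<int> const &fds, long msec)
{
    fd_set set;
    FD_ZERO(&set);
    int max = 0;                        // highest descriptor + 1
    for (std::size_t idx = 0; idx != fds.size(); ++idx)
    {
        FD_SET(fds[idx], &set);
        if (fds[idx] >= max)
            max = fds[idx] + 1;
    }
    timeval tv;
    tv.tv_sec = msec / 1000;
    tv.tv_usec = (msec % 1000) * 1000;
    std::vector<int> ret;
    if (select(max, &set, 0, 0, &tv) <= 0)      // timeout or failure
        return ret;
    for (std::size_t idx = 0; idx != fds.size(); ++idx)
    {
        if (FD_ISSET(fds[idx], &set))   // select() modified `set': only
            ret.push_back(fds[idx]);    // the active descriptors remain
    }
    return ret;
}

void demoSelect()
{
    int fd[2];
    int failed = pipe(fd);
    assert(failed == 0);
    std::vector<int> fds(1, fd[0]);             // monitor the reading end
    assert(readable(fds, 50).empty());          // nothing was written yet
    ssize_t written = write(fd[1], "x", 1);
    assert(written == 1);
    std::vector<int> ready = readable(fds, 50);
    assert(ready.size() == 1 && ready[0] == fd[0]);
    close(fd[0]);
    close(fd[1]);
}
```

As select() overwrites the sets it receives, the sets must be rebuilt (or, as Selector does, copied) before each call.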
20.3.5.2: The class `Monitor'
The monitor program uses a Monitor object to do most of the
work. The class has only one public constructor and one public member,
run(), to perform its tasks. Therefore, all other member functions
described below should be declared in the class' private section.
Monitor defines a private enum Commands, symbolically listing the
various commands its input language supports, as well as several data members,
among which a Selector object and a map using the file descriptors from
which the children's output is read as its keys, and pointers to Child objects
(see section 20.3.5.3) as its values. Furthermore, Monitor has a static array
member s_handler[], storing pointers to member functions handling user commands.
A destructor should have been implemented too, but its implementation is
left as an exercise to the reader. Before the class interface can be processed
by the compiler, it must have seen select.h and child.h. Here is the
class header, including its private data section:
class Monitor
{
enum Commands
{
UNKNOWN,
START,
EXIT,
STOP,
TEXT
};
static void (Monitor::*s_handler[])(int, std::string const &);
Selector d_selector;
int d_nr;
std::map<int, Child *> d_child;
Since there's only one non-class type data member, the class' constructor is very short and can be implemented inline:
Monitor()
:
d_nr(0)
{}
The core of Monitor's activities is performed by run(). It
performs the following tasks:
waitForChild(). It is installed by run(), and it will wait for the
child's completion. Once the child is completed, it will re-install itself so
that the next termination signal may also be caught. Here is
waitForChild():
void Monitor::waitForChild(int signum)
{
int status;
wait(&status);
signal(SIGCHLD, waitForChild);
}
Monitor object will listen only to its standard
input: the set of input file descriptors to which d_selector will listen
is initialized to STDIN_FILENO.
d_selector's wait() function is called.
If input on cin is available, it is processed by processInput().
Otherwise, the input has arrived from a child process. Information sent by
children is processed by processChild().
run()'s implementation:
#include "monitor.ih"
void Monitor::run()
{
signal(SIGCHLD, waitForChild);
d_selector.addReadFd(STDIN_FILENO);
while (true)
{
cout << "? " << flush;
try
{
d_selector.wait();
int fd;
while ((fd = d_selector.readFd()) != -1)
{
if (fd == STDIN_FILENO)
processInput();
else
processChild(fd);
}
}
catch (...)
{
cerr << "select failed, exiting\n";
exiting();
}
}
}
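The SIGCHLD handling installed by run() uses signal() and wait(). As an alternative (a swapped-in technique, not the approach taken by Monitor), a handler may be installed using sigaction(), in which case it remains installed, and it may call waitpid() using WNOHANG in a loop, so that several children terminating at about the same time are all collected. The names g_reaped, reapChildren(), installReaper() and demoReaper() are hypothetical:

```cpp
#include <cassert>
#include <cerrno>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

volatile sig_atomic_t g_reaped = 0;     // hypothetical: nr. of children
                                        // collected so far
extern "C" void reapChildren(int)
{
    int saved = errno;                  // waitpid() may change errno
    while (waitpid(-1, 0, WNOHANG) > 0) // collect every terminated child
        g_reaped = g_reaped + 1;
    errno = saved;
}

bool installReaper()
{
    struct sigaction action;
    action.sa_handler = reapChildren;
    sigemptyset(&action.sa_mask);
    action.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    return sigaction(SIGCHLD, &action, 0) == 0;
}

// Starts one short-lived child and waits (for at most about two seconds)
// until the handler has collected it.
void demoReaper()
{
    bool installed = installReaper();
    assert(installed);
    int before = g_reaped;
    pid_t pid = fork();
    assert(pid >= 0);
    if (pid == 0)
        _exit(0);                       // the child terminates at once
    timespec pause = { 0, 10000000 };   // 10 milliseconds
    for (int count = 0; count != 200 && g_reaped == before; ++count)
        nanosleep(&pause, 0);
    assert(g_reaped == before + 1);
}
```

Only async-signal-safe functions like waitpid() are called from the handler, which merely updates a counter of type sig_atomic_t.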
The member function processInput() reads the commands entered by the
user via the program's standard input stream. The member itself is rather
simple: it calls next() to obtain the next command entered by the user,
and then calls the corresponding function via the corresponding element of the
s_handler[] array. This array and the members processInput() and
next() were defined as follows:
void (Monitor::*Monitor::s_handler[])(int, string const &) =
{
&Monitor::unknown, // order follows enum Command's
&Monitor::createNewChild, // elements
&Monitor::exiting,
&Monitor::stopChild,
&Monitor::sendChild,
};
void Monitor::processInput()
{
string line;
int value;
Commands cmd = next(&value, &line);
(this->*s_handler[cmd])(value, line);
}
Monitor::Commands Monitor::next(int *value, string *line)
{
if (!getline(cin, *line))
throw "Monitor::next(): reading cin failed";
if (*line == "start")
return START;
if (*line == "exit")
return EXIT;
if (line->find("stop") == 0)
{
istringstream istr(line->substr(4));
istr >> *value;
return !istr ? UNKNOWN : STOP;
}
istringstream istr(line->c_str());
istr >> *value;
if (istr)
{
getline(istr, *line);
return TEXT;
}
return UNKNOWN;
}
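The dispatching technique used by processInput() can be studied in isolation. The class Dispatcher below is a hypothetical, reduced variant, written for this illustration only: its handlers return strings instead of acting on child processes. As with Monitor, the order of the elements of the static array of pointers to member functions must match the order of the enumeration values:

```cpp
#include <cassert>
#include <sstream>
#include <string>

class Dispatcher                        // hypothetical class
{
    public:
        std::string handle(std::string const &line);
    private:
        enum Command                    // order must match s_handler[]
        {
            UNKNOWN,
            START,
            STOP
        };
        static std::string (Dispatcher::*s_handler[])(int);

        Command parse(std::string const &line, int *value);
        std::string unknown(int);
        std::string start(int);
        std::string stop(int nr);
};

std::string (Dispatcher::*Dispatcher::s_handler[])(int) =
{
    &Dispatcher::unknown,
    &Dispatcher::start,
    &Dispatcher::stop,
};

Dispatcher::Command Dispatcher::parse(std::string const &line, int *value)
{
    *value = 0;
    if (line == "start")
        return START;
    if (line.compare(0, 4, "stop") == 0)    // line starts with "stop"
    {
        std::istringstream in(line.substr(4));
        if (in >> *value)                   // a number must follow
            return STOP;
    }
    return UNKNOWN;
}

std::string Dispatcher::unknown(int)
{
    return "unknown command";
}

std::string Dispatcher::start(int)
{
    return "started";
}

std::string Dispatcher::stop(int nr)
{
    std::ostringstream out;
    out << "stopped " << nr;
    return out.str();
}

std::string Dispatcher::handle(std::string const &line)
{
    int value;
    Command cmd = parse(line, &value);
    return (this->*s_handler[cmd])(value);  // call via pointer to member
}

void demoDispatch()
{
    Dispatcher dispatcher;
    assert(dispatcher.handle("stop 3") == "stopped 3");
}
```

Adding a command to such a design requires a new enumeration value, a new handler and a new array element, but no changes to handle() itself.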
All other input sensed by d_selector has been created by child
processes. Because d_selector's readFd() member returns the
corresponding input file descriptor, this descriptor can be passed to
processChild(). Then, using an
ifdstreambuf (see section 20.1.2.1),
its information is read from an input stream. The
communication protocol
used here is rather basic: for every line of input sent to a child, the child
sends exactly one line of text in return. Consequently, processChild()
just has to read one line of text:
void Monitor::processChild(int fd)
{
ifdstreambuf ifdbuf(fd);
istream istr(&ifdbuf);
string line;
getline(istr, line);
cout << d_child[fd]->pid() << ": " << line << endl;
}
Please note the construction d_child[fd]->pid() used in the above
source. Monitor defines the data member map<int, Child *> d_child.
This map contains the file descriptor used for reading a child's output as
its key, and a pointer to the Child object as its value. A pointer is used
here, rather than a Child object, since we do want to use the facilities
offered by the map, but don't want to copy a Child object.
The implication of using pointers as map-values is of course that the
responsibility to destruct the Child object once it becomes superfluous
now lies with the programmer, and not any more with the run-time support
system.
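What this responsibility amounts to may be sketched using a small stand-in class. The names Item and Registry are hypothetical, and the sketch is not Monitor's actual implementation: every pointer stored in the map is deleted exactly once, either when its element is removed or by the destructor, and copying is prevented, as a copy would result in the same objects being deleted twice:

```cpp
#include <cassert>
#include <map>

struct Item                             // hypothetical stand-in for Child,
{                                       // counting the existing objects
    static int s_alive;
    Item()
    {
        ++s_alive;
    }
    ~Item()
    {
        --s_alive;
    }
};
int Item::s_alive = 0;

class Registry                          // hypothetical owner of the Items
{
    typedef std::map<int, Item *> Map;
    Map d_item;

    Registry(Registry const &);             // not implemented: copies
    Registry &operator=(Registry const &);  // would cause double deletes
    public:
        Registry()
        {}
        ~Registry()                     // destroys all remaining Items
        {
            for (Map::iterator it = d_item.begin(); it != d_item.end(); ++it)
                delete it->second;
        }
        void add(int key)
        {
            Item *&slot = d_item[key];  // 0 for a new key
            delete slot;                // destroys an earlier Item, if any
            slot = new Item;
        }
        void remove(int key)
        {
            Map::iterator it = d_item.find(key);
            if (it != d_item.end())
            {
                delete it->second;      // first the object,
                d_item.erase(it);       // then the map's element
            }
        }
};

void demoOwnership()
{
    {
        Registry registry;
        registry.add(1);
        registry.add(2);
        registry.remove(1);
        assert(Item::s_alive == 1);
    }
    assert(Item::s_alive == 0);         // the destructor deleted the rest
}
```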
Now that run()'s implementation has been covered, we'll concentrate on
the various commands users might enter:
start command is issued, a new child process is started.
A new element is added to d_child by the member createNewChild().
Next, the Child object should start its activities, but the Monitor
object can not wait here for the child process to complete its activities, as
there is no well-defined endpoint in the near future, and the user will
probably want to enter more commands. Therefore, the Child process
will run as a
daemon: its parent process will terminate immediately, and
its own child process will continue in the background. Consequently,
createNewChild() calls the child's fork() member. Although it is
the child's fork() function that is called, it is still the monitor
program wherein fork() is called. So, the monitor program is
duplicated by fork(). Execution then continues:
Child's parentProcess() in its parent process;
Child's childProcess() in its child process
Since Child's parentProcess() is an empty function, returning
immediately, the Child's parent process effectively continues immediately
below createNewChild()'s cp->fork() statement. As the child process
never returns (see section 20.3.5.3), the code below cp->fork() is never
executed by the Child's child process. This is exactly as it should be.
In the parent process, createNewChild()'s remaining code simply
adds the file descriptor that's available for reading information from the
child to the set of input file descriptors monitored by d_selector, and
uses d_child to establish the association between that
file descriptor and the Child object's address:
void Monitor::createNewChild(int, string const &)
{
Child *cp = new Child(++d_nr);
cp->fork();
int fd = cp->readFd();
d_selector.addReadFd(fd);
d_child[fd] = cp;
cerr << "Child " << d_nr << " started\n";
}
stop <nr>
and <nr> text commands. The former command terminates child process
<nr>, by calling stopChild(). This function locates the child process
having that order number using an anonymous object of the class Find,
nested inside Monitor. The class Find simply compares the
provided nr with the children's order number returned by their nr()
members:
class Find
{
int d_nr;
public:
Find(int nr)
:
d_nr(nr)
{}
bool operator()(std::map<int, Child *>::value_type &vt)
const
{
return d_nr == vt.second->nr();
}
};
If the child process having order number nr was found, its file
descriptor is removed from d_selector's set of input file
descriptors. Then the child process itself is terminated by the static member
killChild(). The member killChild() is declared as a static member
function, as it is also used as function argument of the for_each() generic
algorithm by exiting() (see below). Here is killChild()'s
implementation:
void Monitor::killChild(map<int, Child *>::value_type it)
{
if (kill(it.second->pid(), SIGTERM))
cerr << "Couldn't kill process " << it.second->pid() << endl;
}
Having terminated the specified child process, the corresponding Child
object is destroyed and its pointer is removed from d_child:
void Monitor::stopChild(int nr, string const &)
{
map<int, Child *>::iterator it =
find_if(d_child.begin(), d_child.end(), Find(nr));
if (it == d_child.end())
cerr << "No child number " << nr << endl;
else
{
d_selector.rmReadFd(it->second->readFd());
killChild(*it);
delete it->second;
d_child.erase(it);
}
}
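The search performed by stopChild() looks for a property of the map's values rather than for a key, which is why find_if() and a function object are used instead of the map's own find() member. Here is a minimal, self-contained sketch of that technique. The names Entry, HasNr and keyOf() are hypothetical:

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>

struct Entry                            // hypothetical value type
{
    int nr;                             // order number
    std::string name;
};

class HasNr                             // function object: compares an
{                                       // element's order number to d_nr
    int d_nr;
    public:
        explicit HasNr(int nr)
        :
            d_nr(nr)
        {}
        bool operator()(std::map<int, Entry>::value_type const &vt) const
        {
            return vt.second.nr == d_nr;
        }
};

// Returns the key of the element whose value has order number `nr',
// or -1 if there is no such element.
int keyOf(std::map<int, Entry> const &entries, int nr)
{
    std::map<int, Entry>::const_iterator it =
        std::find_if(entries.begin(), entries.end(), HasNr(nr));
    return it == entries.end() ? -1 : it->first;
}

void demoFind()
{
    std::map<int, Entry> entries;
    Entry first = { 1, "first" };
    Entry second = { 2, "second" };
    entries[7] = first;                 // keys: e.g., file descriptors
    entries[9] = second;
    assert(keyOf(entries, 2) == 9);
    assert(keyOf(entries, 3) == -1);
}
```

Such a search visits the elements one by one. For the few children started by a user of the monitor program this is of no concern.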
<nr> text will send text to child process
nr, using the member function sendChild(). This function, too, will
use a Find object to locate the process having order number nr, and
will then simply insert the text into the writing end of a pipe connected to
the indicated child process:
void Monitor::sendChild(int nr, string const &line)
{
map<int, Child *>::iterator it =
find_if(d_child.begin(), d_child.end(), Find(nr));
if (it == d_child.end())
cerr << "No child number " << nr << endl;
else
{
ofdnstreambuf ofdn(it->second->writeFd());
ostream out(&ofdn);
out << line << endl;
}
}
exit the member exiting() is called.
It terminates all child processes, by visiting
all elements of d_child, using the
for_each() generic
algorithm (see section 17.4.17). The program is subsequently terminated:
void Monitor::exiting(int, string const &)
{
for_each(d_child.begin(), d_child.end(), killChild);
exit(0);
}
main() function is simply:
#include "monitor.h"
int main()
{
Monitor monitor;
monitor.run();
}
/*
Example of a session:
# a.out
? start
Child 1 started
? 1 hello world
? 3394: Child 1:1: hello world
? 1 hi there!
? 3394: Child 1:2: hi there!
? start
Child 2 started
? 3394: Child 1: standing by
? 3395: Child 2: standing by
? 3394: Child 1: standing by
? 3395: Child 2: standing by
? stop 1
? 3395: Child 2: standing by
? 2 hello world
? 3395: Child 2:1: hello world
? 1 hello world
No child number 1
? exit3395: Child 2: standing by
?
#
*/
20.3.5.3: The class `Child'
When the Monitor object starts a child process, it has to create an
object of the class Child. The Child class is derived from the class
Fork, allowing its construction as a
daemon, as discussed in the
previous section. Since a Child object is a daemon, we know that its
parentProcess() member should be defined as an empty function. Its
childProcess() must of course still be defined. Here are the characteristics
of the class Child:
Child class defines two Pipe data members, to allow
communications between its own child- and parent processes. As these pipes are
used by the Child's child process, their names are aimed at the child
process: the child process reads from d_in, and writes to d_out. Here
are Child class' header and its private data:
class Child: public Fork
{
    Pipe d_in;
    Pipe d_out;
    int d_parentReadFd;
    int d_parentWriteFd;
    int d_nr;
Child's constructor simply stores its argument, a
child-process order number, in its own d_nr data member:
    Child(int nr)
    :
        d_nr(nr)
    {}
Child's child process will simply obtain its information from
its standard input stream, and it will write its information to its standard
output stream. Since the communication channels are pipes, redirections must
be configured. Here is the implementation of the childRedirections()
member:
void Child::childRedirections()
{
    d_in.readFrom(STDIN_FILENO);
    d_out.writtenBy(STDOUT_FILENO);
}
Seen from the parent process the roles of the pipes are reversed: d_in is
used for writing by the parent, and d_out is used for reading by the parent.
Here is the implementation of parentRedirections():
void Child::parentRedirections()
{
    d_parentReadFd = d_out.readOnly();
    d_parentWriteFd = d_in.writeOnly();
}
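The redirections can also be illustrated with plain POSIX calls. The following is only a rough sketch of the idea, using pipe(), fork() and dup2() directly rather than the Annotations' Pipe and Fork classes. The names writeAll and roundTrip are hypothetical, and the sketch assumes a POSIX system and messages shorter than 256 bytes.

```cpp
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Writes all `size` bytes to fd; returns false if write() fails
bool writeAll(int fd, char const *data, size_t size)
{
    while (size > 0)
    {
        ssize_t written = write(fd, data, size);
        if (written <= 0)
            return false;
        data += written;
        size -= written;
    }
    return true;
}

// Hypothetical helper: sends `line` to a forked child, returns its reply
std::string roundTrip(std::string const &line)
{
    int toChild[2];                 // parent writes [1], child reads [0]
    int fromChild[2];               // child writes [1], parent reads [0]
    if (pipe(toChild) != 0 || pipe(fromChild) != 0)
        return "";

    pid_t pid = fork();
    if (pid < 0)
        return "";

    if (pid == 0)                   // child: redirect stdin and stdout
    {
        dup2(toChild[0], STDIN_FILENO);
        dup2(fromChild[1], STDOUT_FILENO);
        close(toChild[0]);
        close(toChild[1]);
        close(fromChild[0]);
        close(fromChild[1]);

        char buffer[256];
        ssize_t nRead = read(STDIN_FILENO, buffer, sizeof(buffer));
        if (nRead > 0 && writeAll(STDOUT_FILENO, "child: ", 7))
            writeAll(STDOUT_FILENO, buffer, nRead);
        _exit(0);
    }

    close(toChild[0]);              // parent: keep the opposite ends only
    close(fromChild[1]);

    writeAll(toChild[1], line.data(), line.size());
    close(toChild[1]);              // end of input for the child

    std::string reply;
    char buffer[256];
    ssize_t nRead;
    while ((nRead = read(fromChild[0], buffer, sizeof(buffer))) > 0)
        reply.append(buffer, nRead);
    close(fromChild[0]);

    waitpid(pid, 0, 0);             // collect the child's exit status
    return reply;
}
```

In the sketch, toChild plays the role of d_in and fromChild the role of d_out: the child reads from the first and writes to the second, while the parent uses the opposite ends.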
The Child object will exist until it is destroyed by the
Monitor's stopChild() member. By allowing its creator, the Monitor
object, to access the parent-side ends of the pipes, the Monitor object
can communicate with the Child's child process via those pipe-ends. The
members readFd() and writeFd() allow the Monitor object to access
these pipe-ends:
    int readFd()
    {
        return d_parentReadFd;
    }
    int writeFd()
    {
        return d_parentWriteFd;
    }
The Child object's child process basically has two tasks to perform: it
sends a message to its standard output when its alarm goes off, and it echoes
the messages appearing at its standard input.
To do so, childProcess() defines a local Selector object, adding
STDIN_FILENO to its set of monitored input file descriptors.
Then, in an eternal loop, childProcess() waits for selector.wait()
to return. When the alarm goes off, it sends a message to its standard output
(hence, into the writing pipe). Otherwise it will echo the messages appearing
at its standard input to its standard output. Here is the implementation of
the childProcess() member:
void Child::childProcess()
{
    Selector selector;
    unsigned message = 0;
    selector.addReadFd(STDIN_FILENO);
    selector.setAlarm(5);
    while (true)
    {
        try
        {
            if (!selector.wait())           // timeout
                                            // endl flushes the message
                                            // into the pipe
                cout << "Child " << d_nr << ": standing by" << endl;
            else
            {
                string line;
                getline(cin, line);
                cout << "Child " << d_nr << ":" << ++message << ": " <<
                        line << endl;
            }
        }
        catch (...)
        {
            cout << "Child " << d_nr << ":" << ++message << ": " <<
                    "select() failed" << endl;
        }
    }
    exit(0);
}
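The Selector class itself is not shown in this section. Judging from the error message above, it is built around the select() system call. As a hedged sketch of that underlying mechanism (not of Selector's actual implementation), the following hypothetical functions readableWithin and readByteWithin wait for input on a file descriptor with a timeout, assuming a POSIX system.

```cpp
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

// Returns true if `fd` becomes readable before the timeout expires,
// false on timeout or when select() reports an error
bool readableWithin(int fd, long seconds, long microseconds = 0)
{
    fd_set readSet;
    FD_ZERO(&readSet);
    FD_SET(fd, &readSet);

    timeval timeout;
    timeout.tv_sec = seconds;
    timeout.tv_usec = microseconds;

    int nReady = select(fd + 1, &readSet, 0, 0, &timeout);
    return nReady > 0 && FD_ISSET(fd, &readSet);
}

// Reads one byte if input arrives in time; returns -1 on timeout,
// on error, and at end of input
int readByteWithin(int fd, long seconds)
{
    char ch;
    if (!readableWithin(fd, seconds) || read(fd, &ch, 1) != 1)
        return -1;
    return static_cast<unsigned char>(ch);
}
```

A loop like the one in childProcess() corresponds to calling such a function repeatedly: a false return value represents the `standing by' case, a true return value means that a line can be read.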
The following two members allow the Monitor object to access the
Child's process ID and order number, respectively:
    int pid()
    {
        return Fork::pid();
    }
    int nr()
    {
        return d_nr;
    }
Some operators appear to be missing: there appear to be no predefined
function objects corresponding to bitwise operations. However, their
construction is, given the available predefined function objects, not
difficult. The following examples show a template class implementing a
function object calling the bitwise and (operator&()), and a template
class implementing a function object calling the unary not
(operator~()). It is left to the reader to construct similar function
objects for other operators.
Here is the implementation of a function object calling the bitwise
operator&():
#include <functional>
template <typename Type>
struct bit_and: public std::binary_function<Type, Type, Type>
{
                    // returns the bitwise and of its two arguments
    Type operator()(Type const &lhs, Type const &rhs) const
    {
        return lhs & rhs;
    }
};
Here is the implementation of a function object calling operator~():
#include <functional>
template <typename Type>
struct bit_not: public std::unary_function<Type, Type>
{
                    // returns the bitwise complement of its argument
    Type operator()(Type const &value) const
    {
        return ~value;
    }
};
These and other missing predefined function objects are also implemented
in the file bitfunctional, which is found in the cplusplus.yo.zip archive.
Here is an example using bit_and() removing all odd numbers from a
vector of int values:
#include <iostream>
#include <algorithm>
#include <functional>
#include <iterator>
#include <vector>
#include "bitand.h"
using namespace std;
int main()
{
    vector<int> vi;
    for (int idx = 0; idx < 10; ++idx)
        vi.push_back(idx);
    copy
    (
        vi.begin(),
        // ::bit_and selects the template from bitand.h, not std::bit_and
        remove_if(vi.begin(), vi.end(), bind2nd(::bit_and<int>(), 1)),
        ostream_iterator<int>(cout, " ")
    );
    cout << endl;
}
/*
Generated output:
0 2 4 6 8
*/
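Function objects for the other bitwise operators can be written along the same lines. The following is a minimal sketch for the bitwise or, combined with std::accumulate(). The namespace sketch and the function orAll are hypothetical names, chosen to avoid a clash with the std::bit_and, std::bit_or and std::bit_xor function objects that C++11 added to <functional>.

```cpp
#include <numeric>
#include <vector>

namespace sketch
{
    // Function object calling operator|() for its two arguments
    template <typename Type>
    struct bit_or
    {
        Type operator()(Type const &lhs, Type const &rhs) const
        {
            return lhs | rhs;
        }
    };
}

// Combines all elements using bitwise or; returns 0 for an empty vector
unsigned orAll(std::vector<unsigned> const &values)
{
    return std::accumulate(values.begin(), values.end(), 0u,
                           sketch::bit_or<unsigned>());
}
```

Since std::accumulate() only calls the function object, no base class like std::binary_function is needed here. It would be needed if the object were passed to an adaptor like bind2nd().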
An object of this nested iterator class handled the dereferencing of the pointers stored in the vector. This allowed us to sort the strings pointed to by the vector's elements rather than the pointers.
A drawback of the approach taken in section 19.11.1 is that the class implementing the iterator is closely tied to the derived class as the iterator class was implemented as a nested class. What if we would like to provide any class derived from a container class storing pointers with an iterator handling the pointer-dereferencing?
In this section a variant to the earlier (nested class) approach is discussed. The iterator class will be defined as a template class, parameterizing the data type to which the container's elements point as well as the iterator type of the container itself. Once again we will implement a RandomIterator as it is the most complex iterator type.
Our class is named RandomPtrIterator, indicating that it is a random
iterator operating on pointer values. The template class defines three
template type parameters:
The first parameter specifies the client class, i.e., the class derived
from the container class (Class). Like the earlier nested class,
RandomPtrIterator's constructor will be private. To allow client classes to
construct RandomPtrIterators we will therefore need friend
declarations. However, a friend class Class cannot be defined: template
parameter types cannot be used in friend class ... declarations. But this
is no big problem: not every member of the client class needs to construct
iterators. In fact, only Class's begin() and end() members must be
able to construct iterators. Using the template's first parameter, friend
declarations can be specified for the client's begin() and end()
members.
The second parameter specifies the container's iterator type
(BaseIterator);
The third parameter specifies the data type to which the container's
elements point (Type).
RandomPtrIterator uses one private data
element, a BaseIterator. Here is the initial section, including the
constructor, of the class RandomPtrIterator:
#include <iterator>
template <typename Class, typename BaseIterator, typename Type>
class RandomPtrIterator:
    public std::iterator<std::random_access_iterator_tag, Type>
{
    friend RandomPtrIterator<Class, BaseIterator, Type> Class::begin();
    friend RandomPtrIterator<Class, BaseIterator, Type> Class::end();
    BaseIterator d_current;
    RandomPtrIterator(BaseIterator const &current)
    :
        d_current(current)
    {}
Dissecting its friend declarations, we see that the members
begin() and end() of a class Class, returning a
RandomPtrIterator object for the types Class, BaseIterator and
Type are granted access to RandomPtrIterator's private constructor.
That is exactly what we want. Note that begin() and end() are declared
as bound friends.
All RandomPtrIterator's remaining members are public. Since
RandomPtrIterator is just a generalization of the nested class
iterator developed in section 19.11.1, re-implementing the required
member functions is easy, and only requires us to change iterator into
RandomPtrIterator and to change std::string into Type. For
example, operator<(), defined in the class iterator as
    bool operator<(iterator const &other) const
    {
        return **d_current < **other.d_current;
    }
is re-implemented as:
    bool operator<(RandomPtrIterator const &other) const
    {
        return **d_current < **other.d_current;
    }
As a second example: operator*(), defined in the class
iterator as
    std::string &operator*() const
    {
        return **d_current;
    }
is re-implemented as:
    Type &operator*() const
    {
        return **d_current;
    }
Reimplementing the class StringPtr developed in section 19.11.1
is not difficult either. Apart from including the header file defining the
template class RandomPtrIterator, it requires only a single modification
as its iterator typedef must now be associated with a
RandomPtrIterator:
    typedef RandomPtrIterator
            <
                StringPtr,
                std::vector<std::string *>::iterator,
                std::string
            >
            iterator;
Including StringPtr's modified header file into the program given in
section 19.11.2 will result in a program behaving identically to its
earlier version, albeit that StringPtr::begin() and StringPtr::end()
now return iterator objects constructed from a template definition.
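To see the idea of a pointer-dereferencing iterator in isolation, here is a self-contained sketch. It is not the RandomPtrIterator class itself: the hypothetical PtrIterator below omits the Class parameter and the friend declarations, has a public constructor, and defines the iterator typedefs itself instead of deriving from std::iterator. It is also an assumption of this sketch that the relational operators compare positions, which is how the standard algorithms compare random access iterators.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

template <typename BaseIterator, typename Type>
class PtrIterator
{
    BaseIterator d_current;

    public:
        typedef std::random_access_iterator_tag iterator_category;
        typedef Type                            value_type;
        typedef std::ptrdiff_t                  difference_type;
        typedef Type                           *pointer;
        typedef Type                           &reference;
        typedef PtrIterator                     Self;

        PtrIterator() : d_current() {}
        explicit PtrIterator(BaseIterator const &current)
        :
            d_current(current)
        {}

        // dereferencing skips the stored pointer
        Type &operator*() const     { return **d_current; }
        Type *operator->() const    { return *d_current; }
        Type &operator[](difference_type n) const { return *d_current[n]; }

        Self &operator++()      { ++d_current; return *this; }
        Self &operator--()      { --d_current; return *this; }
        Self operator++(int)    { Self old(*this); ++d_current; return old; }
        Self operator--(int)    { Self old(*this); --d_current; return old; }
        Self &operator+=(difference_type n) { d_current += n; return *this; }
        Self &operator-=(difference_type n) { d_current -= n; return *this; }
        Self operator+(difference_type n) const
        {
            return Self(d_current + n);
        }
        Self operator-(difference_type n) const
        {
            return Self(d_current - n);
        }
        difference_type operator-(Self const &other) const
        {
            return d_current - other.d_current;
        }

        // positions are compared, not the values pointed to
        bool operator==(Self const &rhs) const
        { return d_current == rhs.d_current; }
        bool operator!=(Self const &rhs) const
        { return d_current != rhs.d_current; }
        bool operator<(Self const &rhs) const
        { return d_current < rhs.d_current; }
        bool operator>(Self const &rhs) const
        { return d_current > rhs.d_current; }
        bool operator<=(Self const &rhs) const
        { return d_current <= rhs.d_current; }
        bool operator>=(Self const &rhs) const
        { return d_current >= rhs.d_current; }
};

// Sorts the strings the pointers refer to; the pointers stay in place
void sortPointedTo(std::vector<std::string *> &ptrs)
{
    typedef PtrIterator<std::vector<std::string *>::iterator, std::string>
            Iter;
    std::sort(Iter(ptrs.begin()), Iter(ptrs.end()));
}
```

After calling sortPointedTo() the vector still holds the same pointers in the same order, but the strings they point to have been rearranged into sorted order.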
The C standard library offers conversion functions like atoi(),
atol(), and other functions, which can be used to convert ASCII-Z
strings to numerical values. In C++, these functions are still available,
but a more type safe way to convert text to other types is by using
objects of the class std::istringstream.
Using the std::istringstream class instead of the C standard
conversion functions may have the advantage of type-safety, but it also
appears to be a rather cumbersome alternative. After all, we will have to
construct and initialize a std::istringstream object first, before we're
actually able to extract a value of some type from it. This requires us to use
a variable. Then, if the extracted value is actually only needed to
initialize some function-parameter, one might wonder whether the additional
variable and the istringstream construction can somehow be avoided.
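The steps involved can be sketched as follows. The function toInt is hypothetical and only serves to show the named std::istringstream object and the named variable that are required for each conversion. Returning 0 when the extraction fails is an assumption of this sketch.

```cpp
#include <sstream>
#include <string>

// Converts text to an int; returns 0 if no int can be extracted
int toInt(std::string const &text)
{
    std::istringstream in(text);    // the stream must be constructed first
    int value = 0;                  // a variable receives the value
    if (!(in >> value))
        return 0;
    return value;
}
```

Every conversion to another type would need a comparable function or comparable statements.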
In this section we'll develop a class (A2x) avoiding all the
disadvantages of the standard C library functions, without requiring the
cumbersome definitions of std::istringstream objects over and over
again. The class is called A2x for `ascii to anything'.
A2x objects can be used to obtain a value for any type extractable
from std::istream objects given its textual representation. Since A2x
represents the object-variant of the C functions, it is not only type-safe
but also extensible. Consequently, their use is greatly preferred over the
standard C functions. Here are its characteristics:
A2x is derived from std::istringstream, so all members
of the class std::istringstream are available. Thus, extractions of values
of variables can always be performed effortlessly:
class A2x: public std::istringstream
A2x has a default constructor and a constructor expecting a
std::string argument. The latter constructor may be used to initialize
A2x objects with text to be converted (e.g., a line of text obtained from
reading a configuration file):
    A2x()
    {}
    A2x(std::string const &str)
    :
        std::istringstream(str)
    {}
A2x's real strength comes from its operator Type() conversion
member template. As it is a member template, it will automatically adapt
itself to the type of the variable that should be given a value, obtained by
converting the text stored inside the A2x object to the variable's
type. When the extraction fails, A2x's inherited good() member will
return false:
    template <typename Type>
    operator Type()
    {
        Type t;
        return (*this >> t) ? t : Type();
    }
Occasionally the conversion operator must be called explicitly, which
looks like this:
A2x.operator int<int>();
// or just:
A2x.operator int();
Since neither syntax looks attractive, the member template
to() was provided as well, allowing constructions like:
A2x.to(int());
Here is its implementation:
    template <typename Type>
    Type to(Type const&)
    {
        return *this;
    }
Once an A2x object is available, it may be reinitialized using
its operator=() member:
#include "a2x.h"
A2x &A2x::operator=(std::string const &txt)
{
    clear();        // very important!!! If a conversion failed, the object
                    // remains useless until executing this statement
    str(txt);       // store the new text in the stream
    return *this;
}
Here are some examples of its use:
int x = A2x("12");              // initialize int x from a string "12"
A2x a2x("12.50");               // explicitly create an A2x object
double d;
d = a2x;                        // assign a variable using an A2x object
a2x = "err";
d = a2x;                        // d is 0: the conversion failed,
                                // and a2x.good() == false
a2x = " a";                     // reassign a2x to new text
char c = a2x;                   // c now 'a': internally operator>>() is used
                                // so initial blanks are skipped.
extern void expectsInt(int x);  // initialize a parameter using an
expectsInt(A2x("1200"));        // anonymous A2x object
d = A2x("12.45").to(int());     // d is 12, not 12.45
Apart from the class A2x, a complementary class (X2a) can easily be
constructed as well. The construction of X2a is left as an exercise to
the reader.
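Readers who want a starting point for that exercise might consider the following sketch. It is only one possible design and not the Annotations' own X2a: it is an assumption of this sketch that X2a stores a std::ostringstream as a data member, receives the value to convert via a member template constructor, and offers a conversion operator to std::string.

```cpp
#include <sstream>
#include <string>

// Hypothetical counterpart of A2x: converts any insertable value to text
class X2a
{
    std::ostringstream d_out;

    public:
        template <typename Type>
        explicit X2a(Type const &value)
        {
            d_out << value;         // uses Type's insertion operator
        }

        operator std::string() const
        {
            return d_out.str();
        }

        std::string str() const
        {
            return d_out.str();
        }
};
```

With this sketch, a statement like std::string text = X2a(12); initializes text with the characters 12, and any type for which an insertion operator is available can be converted in the same way.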
Many generic algorithms call the function call operator
(operator()()) of function objects that are passed as arguments to
the generic algorithms.
Usually this approach requires the construction of a dedicated class
implementing the required function object. However, in many cases the class
context in which the iterators exist already offers the required
functionality. Alternatively, the functionality might exist as member function
of the objects to which the iterators refer. For example, finding the first
empty string object in a vector of string objects could profitably use
the string::empty() member.
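As an illustration of that last example, the following sketch uses std::mem_fn() (available since C++11) to turn string::empty() into a predicate. The function firstEmpty is a hypothetical name. Note that recent standards do not guarantee that the address of a standard library member function can be taken portably, so the sketch should be read as an illustration of the idea only.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Returns the index of the first empty string, or lines.size() if
// there is no empty string
std::size_t firstEmpty(std::vector<std::string> const &lines)
{
    std::vector<std::string>::const_iterator it =
        std::find_if(lines.begin(), lines.end(),
                     std::mem_fn(&std::string::empty));
    return it - lines.begin();
}
```

No dedicated function object class had to be written here: the member function that is already available is called for each element in turn.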
Another frequently encountered situation is related to a
local context. Once again, consider the situation where the elements
of a string vector are all visited: each object must be inserted in a
stream whose reference is only known to the function in which the string
elements are visited, but some additional information must be passed to the
insertion function as well, making the use of the ostream_iterator less
appropriate.
The frustrating part of using generic algorithms is that these dedicated
function objects often look very much alike, but the standard
solutions (using predefined function objects, using specialized iterators)
seldom do the required job: their fixed function interfaces (e.g.,
equal_to calling the object's operator==()) often are too rigid to be
useful and, furthermore, they are unable to use any additional local
context that is active when they are used.
Nevertheless, one may wonder whether template classes might be constructed which can be used again and again to create dedicated function objects. Such template class instantiations should offer facilities to call configurable (member) functions, using a configurable local context.
In the upcoming sections several wrapper templates supporting these requirements are developed. To support a local context a dedicated local context struct is introduced. Furthermore, the wrapper templates will allow us to specify the member function that should be called in its constructor. Thus the rigidness of the fixed member function as used in the predefined function objects is avoided.
As an example of a generic algorithm usually requiring a simple function
object consider for_each(). The operator()() of the function object
passed to this algorithm receives as its argument a reference to the object to
which the iterators refer. Generally, the operator()() will do one of two
things:
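It may call a member function of the received object (e.g.,
operator()(string &str) may call str.length());
It may call a function, passing it the received object as its argument
(e.g., calling somefunction(str)).
In the latter case somefunction()'s address could actually directly have
been passed to the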
generic algorithm, so why use this complex procedure? The answer is
context: if somefunction() would actually require other arguments,
representing the local context in which somefunction() was called, then
the function object's constructor could have received the local context as its
arguments, passing that local context on to somefunction(), together with
the object received by the function object's operator()() function. There
is no way to pass any local context to the generic algorithm's simple variant,
in which a function's address is passed to the generic function.
At first sight, however, the fact that a local context differs from one
situation to another makes it hard to standardize the local context: a local
context might consist of values, pointers, references, which differ in number
and types from one situation to another. Defining templates for all possible
situations is clearly impractical, and using variadic functions is also not
very attractive, since the arguments passed to a variadic function object
constructor cannot simply be passed on to the function object's
operator()().
The concept of a local context struct is introduced to standardize the local context. It is based on the following considerations:
The function requiring a local context receives a single argument
representing that context: a const & to a local context struct.
The local context struct is defined in the function's class
interface.
Before the function is called, a local context struct is
initialized, which is then passed as argument to the function.
The organization of local context structs will differ from
one situation to the next situation, but there is always just one local
context required. The fact that the inner organization of the local context
differs from one situation to the next causes no difficulty at all to
C++'s template mechanism. Actually, having available a generic type
(Context) together with several concrete instantiations of that
generic type is a mere text-book argument for using templates.
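The considerations above can be sketched in code. The following is not one of the wrapper templates developed in the upcoming sections, just a minimal illustration under stated assumptions: PrintContext, printLine, ContextCaller and printAll are hypothetical names, and the called function is a plain function rather than a member function.

```cpp
#include <algorithm>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical local context struct: everything printLine() needs in
// addition to the element itself
struct PrintContext
{
    std::ostream &out;
    std::string prefix;

    PrintContext(std::ostream &stream, std::string const &text)
    :
        out(stream),
        prefix(text)
    {}
};

// The function doing the actual work receives the element and a
// const & to the local context struct
void printLine(std::string const &line, PrintContext const &context)
{
    context.out << context.prefix << line << '\n';
}

// Generic function object: stores the function's address and the
// context, and passes both on from its operator()()
template <typename Type, typename Context>
class ContextCaller
{
    void (*d_fun)(Type const &, Context const &);
    Context const &d_context;

    public:
        ContextCaller(void (*fun)(Type const &, Context const &),
                      Context const &context)
        :
            d_fun(fun),
            d_context(context)
        {}

        void operator()(Type const &element) const
        {
            d_fun(element, d_context);
        }
};

// Initializes the local context, then visits all elements
std::string printAll(std::vector<std::string> const &lines,
                     std::string const &prefix)
{
    std::ostringstream out;
    PrintContext context(out, prefix);
    std::for_each(lines.begin(), lines.end(),
        ContextCaller<std::string, PrintContext>(printLine, context));
    return out.str();
}
```

The template ContextCaller does not depend on the organization of PrintContext: another situation merely requires another context struct and another function, while the function object's class template can be reused.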
show() may expect two arguments: an
ostream & into which the inf