MS-DOS, PC-BIOS, AND FILE I/O (Part 8)
13.3.10 - Blocked File I/O
13.3.11 - The Program Segment Prefix (PSP)
13.3.10 Blocked File I/O
The examples in the previous section suffer from a major drawback: they are extremely slow. The performance problems with the code above are entirely due to DOS. Making a DOS call is not, shall we say, the fastest operation in the world. Calling DOS every time we want to read or write a single character from/to a file will bring the system to its knees. As it turns out, it doesn't take (practically) any more time to have DOS read or write two characters than it does to read or write one character. Since the amount of time we (usually) spend processing the data is negligible compared to the amount of time DOS takes to return or write the data, reading two characters at a time will essentially double the speed of the program. If reading two characters doubles the processing speed, how about reading four characters? Sure enough, it almost quadruples the processing speed. Likewise, processing ten characters at a time increases the processing speed by almost an order of magnitude.

Alas, this progression doesn't continue forever. There comes a point of diminishing returns, when it takes far too much memory to justify a (very) small improvement in performance (keeping in mind that reading 64K in a single operation requires a 64K memory buffer to hold the data). A good compromise is 256 or 512 bytes. Reading more data doesn't really improve the performance much, yet a 256 or 512 byte buffer is easier to deal with than larger buffers.
Reading data in groups or blocks is called blocked I/O. Blocked I/O is often one to two orders of magnitude faster than single character I/O, so obviously you should use blocked I/O whenever possible.
There is one minor drawback to blocked I/O: it's a little more complex to program than single character I/O. Consider the example presented in the section on the DOS read command:
Example: This example opens a file and reads it to the EOF
                mov     ah, 3dh         ;Open the file
                mov     al, 0           ;Open for reading
                lea     dx, Filename    ;Presume DS points at filename
                int     21h             ; segment
                jc      BadOpen
                mov     FHndl, ax       ;Save file handle

LP:             mov     ah, 3fh         ;Read data from the file
                lea     dx, Buffer      ;Address of data buffer
                mov     cx, 1           ;Read one byte
                mov     bx, FHndl       ;Get file handle value
                int     21h
                jc      ReadError
                cmp     ax, cx          ;EOF reached?
                jne     EOF
                mov     al, Buffer      ;Get character read
                putc                    ;Print it (IOSHELL call)
                jmp     LP              ;Read next byte

EOF:            mov     bx, FHndl
                mov     ah, 3eh         ;Close file
                int     21h
                jc      CloseError
There isn't much to this program at all. Now consider the same example rewritten to use blocked I/O:
Example: This example opens a file and reads it to the EOF using blocked I/O
                mov     ah, 3dh         ;Open the file
                mov     al, 0           ;Open for reading
                lea     dx, Filename    ;Presume DS points at filename
                int     21h             ; segment
                jc      BadOpen
                mov     FHndl, ax       ;Save file handle

LP:             mov     ah, 3fh         ;Read data from the file
                lea     dx, Buffer      ;Address of data buffer
                mov     cx, 256         ;Read 256 bytes
                mov     bx, FHndl       ;Get file handle value
                int     21h
                jc      ReadError
                cmp     ax, cx          ;EOF reached?
                jne     EOF
                mov     si, 0           ;Note: CX=256 at this point.
PrtLp:          mov     al, Buffer[si]  ;Get character read
                putc                    ;Print it
                inc     si
                loop    PrtLp
                jmp     LP              ;Read next block

; Note, just because the number of bytes read doesn't equal 256,
; don't get the idea we're through, there could be up to 255 bytes
; in the buffer still waiting to be processed.

EOF:            mov     cx, ax
                jcxz    EOF2            ;If CX is zero, we're really done.
                mov     si, 0           ;Process the last block of data read
Finis:          mov     al, Buffer[si]  ; from the file which contains
                putc                    ; 1..255 bytes of valid data.
                inc     si
                loop    Finis

EOF2:           mov     bx, FHndl
                mov     ah, 3eh         ;Close file
                int     21h
                jc      CloseError

This example demonstrates one major hassle with blocked I/O: when you reach the end of file, you haven't necessarily processed all of the data in the file. If the block size is 256 and there are 255 bytes left in the file, DOS will return an EOF condition (the number of bytes read doesn't match the request). In this case, we've still got to process the characters that were read. The code above does this in a rather straightforward manner, using a second loop to finish up when the EOF is reached. You've probably noticed that the two print loops are virtually identical. This program can be reduced in size somewhat using the following code, which is only a little more complex:
Example: This example opens a file and reads it to the EOF using blocked I/O
                mov     ah, 3dh         ;Open the file
                mov     al, 0           ;Open for reading
                lea     dx, Filename    ;Presume DS points at filename
                int     21h             ; segment.
                jc      BadOpen
                mov     FHndl, ax       ;Save file handle

LP:             mov     ah, 3fh         ;Read data from the file
                lea     dx, Buffer      ;Address of data buffer
                mov     cx, 256         ;Read 256 bytes
                mov     bx, FHndl       ;Get file handle value
                int     21h
                jc      ReadError
                mov     bx, ax          ;Save byte count for later
                mov     cx, ax
                jcxz    EOF             ;Nothing read, we're done
                mov     si, 0           ;Note: CX=number of bytes read here.
PrtLp:          mov     al, Buffer[si]  ;Get character read
                putc                    ;Print it
                inc     si
                loop    PrtLp
                cmp     bx, 256         ;Reach EOF yet?
                je      LP              ;Full block read, so keep going

EOF:            mov     bx, FHndl
                mov     ah, 3eh         ;Close file
                int     21h
                jc      CloseError

Blocked I/O works best on sequential files, that is, those files opened only for reading or writing (no seeking). When dealing with random access files, you should read or write whole records at one time, using the DOS read/write commands to process the whole record. This is still considerably faster than manipulating the data one byte at a time.
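As a rough sketch of the random access case, the fragment below seeks to one record and reads it with a single DOS call. It is only an illustration: RecSize, RecNum, Rec, SeekError, and ShortRecord are hypothetical names that do not appear elsewhere in this chapter, and FHndl is assumed to hold a handle for a file that is already open.

```asm
RecSize         equ     64              ;Hypothetical record size in bytes

; Seek to record number RecNum (zero-based word variable, hypothetical).

                mov     ax, RecNum      ;DX:AX = RecNum * RecSize
                mov     dx, RecSize
                mul     dx
                mov     cx, dx          ;DOS wants the file offset in CX:DX
                mov     dx, ax
                mov     bx, FHndl       ;Get file handle value
                mov     ax, 4200h       ;Seek relative to start of file
                int     21h
                jc      SeekError

; Read the whole record with one DOS call rather than byte by byte.

                mov     ah, 3fh         ;Read data from the file
                mov     bx, FHndl
                mov     cx, RecSize     ;Read one complete record
                lea     dx, Rec         ;Address of record buffer
                int     21h
                jc      ReadError
                cmp     ax, cx          ;Got a complete record?
                jne     ShortRecord
```

Writing a record back would presumably follow the same pattern, with DOS function 40h in place of 3Fh.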
13.3.11 The Program Segment Prefix (PSP)
When a program is loaded into memory for execution, DOS first builds up a program segment prefix immediately before the program is loaded into memory. This PSP contains lots of information, some of it useful, some of it obsolete. Understanding the layout of the PSP is essential for programmers designing assembly language programs.
The PSP is 256 bytes long and contains the following information:
Offset  Length  Description
0       2       An INT 20h instruction is stored here
2       2       Program ending address
4       1       Unused, reserved by DOS
5       5       Call to DOS function dispatcher
0Ah     4       Address of program termination code
0Eh     4       Address of break handler routine
12h     4       Address of critical error handler routine
16h     22      Reserved for use by DOS
2Ch     2       Segment address of environment area
2Eh     34      Reserved by DOS
50h     3       INT 21h, RETF instructions
53h     9       Reserved by DOS
5Ch     16      Default FCB #1
6Ch     20      Default FCB #2
80h     1       Length of command line string
81h     127     Command line string
Note: locations 80h..FFh are used for the default DTA.
Most of the information in the PSP is of little use to a modern MS-DOS assembly language program. Buried in the PSP, however, are a couple of gems that are worth knowing about. Just for completeness, however, we'll take a look at all of the fields in the PSP.
The first field in the PSP contains an int 20h instruction. Int 20h is an obsolete mechanism used to terminate program execution. Back in the early days of DOS v1.0, your program would execute a jmp to this location in order to terminate. Nowadays, we have DOS function 4Ch, which is much easier (and safer) than jumping to location zero in the PSP. Therefore, this field is obsolete.
Field number two contains a value which points at the last paragraph allocated to your program. By subtracting the address of the PSP from this value, you can determine the amount of memory allocated to your program (and quit if there is insufficient memory available).
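The subtraction described above might look something like the following fragment. This is a sketch under stated assumptions: it presumes a word variable named PSP in the data segment that already holds the PSP segment address, and MinParas and NotEnoughMem are hypothetical names chosen for illustration.

```asm
MinParas        equ     1000h           ;Hypothetical requirement: 1000h
                                        ; paragraphs (64K bytes)

                mov     es, PSP         ;ES = segment address of the PSP
                mov     ax, word ptr es:[2] ;Program ending address (segment)
                sub     ax, PSP         ;AX = paragraphs allocated to us
                cmp     ax, MinParas    ;Enough memory to run?
                jb      NotEnoughMem    ;Quit if not
```

Both values are segment addresses, so the difference is a count of 16-byte paragraphs rather than a count of bytes.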
The third field is the first of many "holes" left in the PSP by Microsoft. Why they're here is anyone's guess.
The fourth field is a call to the DOS function dispatcher. The purpose of this (now obsolete) DOS calling mechanism was to allow some additional compatibility with CP/M-80 programs. For modern DOS programs, there is absolutely no need to worry about this field.
The next three fields are used to store special addresses during the execution of a program. These fields contain the default terminate vector, break vector, and critical error handler vectors. These are the values normally stored in the interrupt vectors for int 22h, int 23h, and int 24h. Because DOS stores a copy of these vectors in the PSP, you can change these vectors so that they point into your own code. When your program terminates, DOS restores those three vectors from these three fields in the PSP. For more details on these interrupt vectors, please consult the DOS technical reference manual.
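To illustrate pointing one of these vectors into your own code, the following sketch installs a replacement break handler using DOS function 25h (set interrupt vector). MyBreak is a hypothetical handler invented for this example. It simply returns, so the break is ignored; a real program would probably do something more useful.

```asm
; Hypothetical break handler: return immediately so the program
; keeps running.

MyBreak         proc    far
                iret
MyBreak         endp
                 .
                 .
                 .
                push    ds              ;DOS wants the new vector in DS:DX
                mov     ax, seg MyBreak
                mov     ds, ax
                mov     dx, offset MyBreak
                mov     ax, 2523h       ;AH=25h (set vector), AL=23h (break)
                int     21h
                pop     ds              ;Restore our data segment
```

Since DOS reloads the break vector from the PSP on termination, this fragment makes no attempt to save and restore the old vector itself.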
The eighth field in the PSP record is another reserved field, currently unavailable for use by your programs.
The ninth field is another real gem. It's the address of the environment strings area. This is a two-byte pointer which contains the segment address of the environment storage area. The environment strings always begin at offset zero within this segment. The environment string area consists of a sequence of zero-terminated strings. It uses the following format:
string1 0 string2 0 string3 0 ... 0 stringn 0 0
That is, the environment area consists of a list of zero-terminated strings, the list itself being terminated by a string of length zero (i.e., a zero all by itself, or two zeros in a row, however you want to look at it). Strings are (usually) placed in the environment area via DOS commands like PATH, SET, etc. Generally, a string in the environment area takes the form
name = parameters
For example, the "SET IPATH=C:\ASSEMBLY\INCLUDE" command copies the string "IPATH=C:\ASSEMBLY\INCLUDE" into the environment string storage area.
Many languages scan the environment storage area to find default filename paths and other pieces of default information set up by DOS. Your programs can take advantage of this as well.
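A minimal sketch of scanning the environment area appears below. It prints each environment string on its own line using the same putc call as the earlier examples. It assumes a word variable named PSP in the data segment that holds the PSP segment address; the labels are hypothetical.

```asm
                mov     es, PSP         ;ES = segment address of the PSP
                mov     es, word ptr es:[2Ch] ;ES = environment segment
                mov     di, 0           ;Strings begin at offset zero

EnvLp:          cmp     byte ptr es:[di], 0 ;Zero-length string?
                je      EnvDone         ;Yes, that's the end of the list
PrtStr:         mov     al, es:[di]     ;Get next character of this string
                inc     di
                cmp     al, 0           ;Hit the terminating zero?
                je      EndStr
                putc                    ;Print it
                jmp     PrtStr

EndStr:         mov     al, 0dh         ;Print a carriage return and
                putc                    ; line feed after each string
                mov     al, 0ah
                putc
                jmp     EnvLp           ;DI now points at the next string

EnvDone:
```

A program looking for one particular name (IPATH, say) would compare the start of each string against that name rather than printing it.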
The next field in the PSP is another block of reserved storage currently undefined by DOS.
The 11th field in the PSP is another call to the DOS function dispatcher. Why this call exists (when the one at location 5 in the PSP already exists and nobody really uses either mechanism to call DOS) is an interesting question. In general, this field should be ignored by your programs.
The 12th field is another block of unused bytes in the PSP which should be ignored.
The 13th and 14th fields in the PSP are the default FCBs (File Control Blocks). File control blocks are another archaic data structure carried over from CP/M-80. FCBs are used only with the obsolete DOS v1.0 file handling routines, so they are of little interest to us. We'll ignore these FCBs in the PSP.
Locations 80h through the end of the PSP contain a very important piece of information: the command line parameters typed on the DOS command line along with your program's name. If the following is typed on the DOS command line:
MYPGM parameter1 parameter2
the following is stored into the command line parameter field:
22, " parameter1 parameter2", 0Dh
Location 80h contains 22 (decimal), the length of the parameters following the program name, including the leading space. Locations 81h through 96h contain the characters making up the parameter string. Location 97h contains a carriage return. Notice that the carriage return character is not figured into the length of the command line string.
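Just to make the layout concrete, here is a small sketch that prints the parameter string exactly as DOS stored it. It relies only on the length byte at 80h and the characters starting at 81h. It assumes a word variable named PSP in the data segment that holds the PSP segment address; CmdLp and NoParms are hypothetical labels.

```asm
                mov     es, PSP         ;ES = segment address of the PSP
                mov     cl, byte ptr es:[80h] ;Get command line length
                mov     ch, 0
                jcxz    NoParms         ;Nothing typed after program name
                mov     si, 81h         ;First character of the parameters
CmdLp:          mov     al, es:[si]     ;Get next character
                putc                    ;Print it
                inc     si
                loop    CmdLp
NoParms:
```

Because the loop is driven by the length byte, it never prints the trailing carriage return.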
Processing the command line string is such an important facet of assembly language programming that this process will be discussed in detail in the next section.
Locations 80h..FFh in the PSP also comprise the default DTA. Therefore, if you don't use DOS function 1Ah to change the DTA and you execute a FIND FIRST FILE, the filename information will be stored starting at location 80h in the PSP.
One important detail we've omitted until now is exactly how you access data in the PSP. Although the PSP is loaded into memory immediately before your program, that doesn't necessarily mean that it appears 100h bytes before your code. Your data segments may have been loaded into memory before your code segments, thereby invalidating this method of locating the PSP. The segment address of the PSP is passed to your program in the ds register. To store the PSP address away in your data segment, your programs should begin with the following code:
                push    ds              ;Save PSP value
                mov     ax, seg DSEG    ;Point DS and ES at our data
                mov     ds, ax          ; segment.
                mov     es, ax
                pop     PSP             ;Store PSP value into "PSP"
                                        ; variable.
                 .
                 .
                 .
Another way to obtain the PSP address, in DOS 5.0 and later, is to make a DOS call. If you load ah with 51h and execute an int 21h instruction, MS-DOS will return the segment address of the current PSP in the bx register.
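A minimal sketch of this DOS call follows. DOS function 51h returns the PSP segment in the bx register. The fragment assumes ds already points at the data segment containing a word variable named PSP.

```asm
                mov     ah, 51h         ;Get PSP address
                int     21h
                mov     PSP, bx         ;DOS returns the PSP segment in BX
```

Unlike the push ds/pop PSP sequence, this call does not depend on the value ds held at program entry, so it can appear anywhere in the program.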
There are lots of tricky things you can do with the data in the PSP. Peter Norton's Programmer's Guide to the IBM PC lists all kinds of tricks. Such operations won't be discussed here because they're a little beyond the scope of this manual.
Chapter Thirteen: MS-DOS
File I/O (Part 8)
28 SEP 1996