The Art of
ASSEMBLY LANGUAGE PROGRAMMING

Chapter Ten (Part 4)

Table of Content

Chapter Ten (Part 6)

CHAPTER TEN:
CONTROL STRUCTURES (Part 5)
10.9 - Nested Statements
10.10 - Timing Delay Loops
10.9 Nested Statements

As long as you stick to the templates provided in the examples presented in this chapter, it is very easy to nest statements inside one another. The secret to making sure your assembly language sequences nest well is to ensure that each construct has one entry point and one exit point. If this is the case, then you will find it easy to combine statements. All of the statements discussed in this chapter follow this rule.

Perhaps the most commonly nested statements are the if..then..else statements. To see how easy it is to nest these statements in assembly language, consider the following Pascal code:

        if (x = y) then
            if (I >= J) then writeln('At point 1')
            else writeln('At point 2')
        else write('Error condition');

To convert this nested if..then..else to assembly language, start with the outermost if, convert it to assembly, then work on the innermost if:

; if (x = y) then

                mov     ax, X
                cmp     ax, Y
                jne     Else0

; Put innermost IF here

                jmp     IfDone0

; Else write('Error condition');

Else0:          print
                byte    "Error condition",0
IfDone0:

As you can see, the above code handles the "if (X=Y)..." instruction, leaving a spot for the second if. Now add in the second if as follows:

; if (x = y) then

                mov     ax, X
                cmp     ax, Y
                jne     Else0

;       IF ( I >= J) then writeln('At point 1')

                mov     ax, I
                cmp     ax, J
                jnge    Else1
                print
                byte    "At point 1",cr,lf,0
                jmp     IfDone1

;       Else writeln ('At point 2');

Else1:          print
                byte    "At point 2",cr,lf,0
IfDone1:

                jmp     IfDone0

; Else write('Error condition');

Else0:          print
                byte    "Error condition",0
IfDone0:

The nested if runs from the comment that introduces it through the IfDone1 label.

There is an obvious optimization which you do not really want to make until speed becomes a real problem. Note in the innermost if statement above that the jmp IfDone1 instruction simply jumps to a jmp instruction which transfers control to IfDone0. It is very tempting to replace the first jmp with one which jumps directly to IfDone0. Indeed, when you go in and optimize your code, this would be a good optimization to make. However, you shouldn't make such optimizations to your code unless you really need the speed. Doing so makes your code harder to read and understand. Remember, we would like all our control structures to have one entry and one exit. Changing this jump as described would give the innermost if statement two exit points.

The for loop is another commonly nested control structure. Once again, the key to building up nested structures is to construct the outside object first and fill in the inner members afterwards. As an example, consider the following nested for loops, which add the elements of a pair of two dimensional arrays together:

        for i := 0 to 7 do
            for j := 0 to 7 do
                A [i,j] := B [i,j] + C [i,j];

As before, begin by constructing the outermost loop first. This code assumes that dx will be the loop control variable for the outermost loop (that is, dx is equivalent to "i"):

; for dx := 0 to 7 do

                mov     dx, 0
ForLp0:         cmp     dx, 7
                jnle    EndFor0

; Put innermost FOR loop here

                inc     dx
                jmp     ForLp0
EndFor0:

Now add the code for the nested for loop. Note the use of the cx register for the loop control variable on the innermost for loop of this code.

; for dx := 0 to 7 do

                mov     dx, 0
ForLp0:         cmp     dx, 7
                jnle    EndFor0

;       for cx := 0 to 7 do

                mov     cx, 0
ForLp1:         cmp     cx, 7
                jnle    EndFor1

; Put code for A[dx,cx] := B[dx,cx] + C[dx,cx] here

                inc     cx
                jmp     ForLp1
EndFor1:

                inc     dx
                jmp     ForLp0
EndFor0:

Once again, the innermost for loop runs from the comment that introduces it through the EndFor1 label. The final step is to add the code which performs the actual computation.
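The placeholder inside the innermost loop could be filled in along the following lines. This is only a sketch under stated assumptions: the three arrays are 8x8 arrays of 16-bit words stored in row-major order in the segment addressed by ds, and the names ArrayA, ArrayB, and ArrayC are hypothetical stand-ins for A, B, and C. The code uses only ax and bx, so the loop control variables in dx and cx are left untouched.

```asm
; ArrayA[dx,cx] := ArrayB[dx,cx] + ArrayC[dx,cx]
;
; Assumed (hypothetical) declarations in the data segment:
;
;       ArrayA  word    8 dup (8 dup (0))
;       ArrayB  word    8 dup (8 dup (0))
;       ArrayC  word    8 dup (8 dup (0))
;
; The byte offset of element [row,col] is (row*8 + col)*2.

                mov     bx, dx          ;BX := row index
                shl     bx, 1
                shl     bx, 1
                shl     bx, 1           ;BX := row * 8
                add     bx, cx          ;BX := row*8 + column
                shl     bx, 1           ;Two bytes per word element
                mov     ax, ArrayB[bx]
                add     ax, ArrayC[bx]
                mov     ArrayA[bx], ax
```

The single-bit shifts keep the sequence legal on an 8088 or 8086, which do not accept an immediate shift count other than one.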

10.10 Timing Delay Loops

Most of the time the computer runs too slowly for most people's tastes. However, there are occasions when it actually runs too fast. One common solution is to create an empty loop to waste a small amount of time. In Pascal you will commonly see loops like:

	for i := 1 to 10000 do ;

In assembly you might see a comparable loop:

                mov     cx, 8000h
DelayLp:        loop    DelayLp

By carefully choosing the number of iterations, you can obtain a relatively accurate delay interval. There is, however, one catch. That relatively accurate delay interval is only going to be accurate on your machine. If you move your program to a different machine with a different CPU, clock speed, number of wait states, different sized cache, or half a dozen other features, you will find that your delay loop takes a completely different amount of time. Since there is better than a hundred to one difference in speed between the high end and low end PCs today, it should come as no surprise that the loop above will execute 100 times faster on some machines than on others.

The fact that one CPU runs 100 times faster than another does not reduce the need to have a delay loop which executes for some fixed amount of time. Indeed, it makes the problem that much more important. Fortunately, the PC provides a hardware based timer which operates at the same speed regardless of the CPU speed. This timer maintains the time of day for the operating system, so it's very important that it run at the same speed whether you're on an 8088 or a Pentium. In the chapter on interrupts you will learn to actually patch into this device to perform various tasks. For now, we will simply take advantage of the fact that this timer chip forces the CPU to increment a 32-bit memory location (40:6ch) about 18.2 times per second. By looking at this variable we can determine the speed of the CPU and adjust the count value for an empty loop accordingly.

The basic idea of the following code is to watch the BIOS timer variable until it changes. Once it changes, start counting the number of iterations through some sort of loop until the BIOS timer variable changes again. Having noted the number of iterations, if you execute a similar loop the same number of times, it should require about 1/18.2 seconds to execute.
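As a rough illustration of what "looking at this variable" involves, the sketch below reads the full 32-bit tick count at 40:6ch into dx:ax. It assumes a real-mode DOS program; the choice of registers and the cli/sti pair around the two reads are assumptions made here so that a timer interrupt cannot update the count between the low and high words.

```asm
; Read the 32-bit BIOS timer variable at 40:6ch into DX:AX.

                mov     ax, 40h
                mov     es, ax          ;ES points at the BIOS variables
                cli                     ;Hold off the timer interrupt
                mov     ax, es:[6ch]    ;L.O. word of the tick count
                mov     dx, es:[6eh]    ;H.O. word of the tick count
                sti                     ;Allow interrupts again
```

A routine that only needs to notice a change from one tick to the next can get by with the low-order word alone, in which case no interrupt protection is needed.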

The following program demonstrates how to create such a Delay routine:

                .xlist
                include         stdlib.a
                includelib      stdlib.lib
                .list

; PPI_B is the I/O address of the keyboard/speaker control
; port. This program accesses it simply to introduce a
; large number of wait states on faster machines. Since the
; PPI (Programmable Peripheral Interface) chip runs at about
; the same speed on all PCs, accessing this chip slows most
; machines down to within a factor of two of the slower
; machines.

PPI_B           equ     61h

; RTC is the address of the BIOS timer variable (40:6ch).
; The BIOS timer interrupt code increments this 32-bit
; location about every 55 ms (1/18.2 seconds). The code
; which initializes everything for the Delay routine
; reads this location to determine when 1/18th seconds
; have passed.

RTC             textequ <es:[6ch]>

dseg            segment para public 'data'

; TimedValue contains the number of iterations the delay
; loop must repeat in order to waste 1/18.2 seconds.

TimedValue      word    0

; RTC2 is a dummy variable used by the Delay routine to
; simulate accessing a BIOS variable.

RTC2            word    0


dseg            ends



cseg            segment para public 'code'
                assume  cs:cseg, ds:dseg

; Main program which tests out the DELAY subroutine.

Main            proc
                mov     ax, dseg
                mov     ds, ax

                print
                byte    "Delay test routine",cr,lf,0

; Okay, let's see how long it takes to count down 1/18th
; of a second. First, point ES at segment 40h in memory.
; The BIOS variables are all in segment 40h.
;
; This code begins by reading the memory timer variable
; and waiting until it changes. Once it changes we can
; begin timing until the next change occurs. That will
; give us 1/18.2 seconds. We cannot start timing right
; away because we might be in the middle of a 1/18.2
; second period.

                mov     ax, 40h
                mov     es, ax
                mov     ax, RTC
RTCMustChange:  cmp     ax, RTC
                je      RTCMustChange

; Okay, begin timing the number of iterations it takes
; for an 18th of a second to pass. Note that this
; code must be very similar to the code in the Delay
; routine.

                mov     cx, 0
                mov     si, RTC
                mov     dx, PPI_B
TimeRTC:        mov     bx, 10
DelayLp:        in      al, dx
                dec     bx
                jne     DelayLp
                cmp     si, RTC
                loope   TimeRTC

                neg     cx                      ;CX counted down!
                mov     TimedValue, cx          ;Save away

                mov     ax, ds
                mov     es, ax

                printf
                byte    "TimedValue = %d",cr,lf
                byte    "Press any key to continue",cr,lf
                byte    "This will begin a delay of five "
                byte    "seconds",cr,lf,0
                dword   TimedValue

                getc

                mov     cx, 90
DelayIt:        call    Delay18
                loop    DelayIt

Quit:           ExitPgm ;DOS macro to quit program.
Main            endp

; Delay18-This routine delays for approximately 1/18th sec.
;        Presumably, the variable "TimedValue" in DS has
;        been initialized with an appropriate count down
;        value before calling this code.

Delay18         proc    near
                push    ds
                push    es
                push    ax
                push    bx
                push    cx
                push    dx
                push    si

                mov     ax, dseg
                mov     es, ax
                mov     ds, ax

; The following code contains two loops. The inside
; nested loop repeats 10 times. The outside loop
; repeats the number of times determined to waste
; 1/18.2 seconds. This loop accesses the hardware
; port "PPI_B" in order to introduce many wait states
; on the faster processors. This helps even out the
; timings on very fast machines by slowing them down.
; Note that accessing PPI_B is only done to introduce
; these wait states, the data read is of no interest
; to this code.
;
; Note the similarity of this code to the code in the
; main program which initializes the TimedValue variable.

                mov     cx, TimedValue
                mov     si, es:RTC2
                mov     dx, PPI_B

TimeRTC:        mov     bx, 10
DelayLp:        in      al, dx
                dec     bx
                jne     DelayLp
                cmp     si, es:RTC2
                loope   TimeRTC

                pop     si
                pop     dx
                pop     cx
                pop     bx
                pop     ax
                pop     es
                pop     ds
                ret
Delay18         endp

cseg            ends

sseg            segment para stack 'stack'
stk             word    1024 dup (0)
sseg            ends
                end     Main
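The main program calls Delay18 90 times to produce its advertised five second delay. The arithmetic behind that count can be written out as follows, taking the approximate rate of 18.2 timer ticks per second used throughout this section and writing n for the number of calls (a symbol introduced here only for illustration):

```latex
\[
  t_{\text{delay}} \approx \frac{n}{18.2}\ \text{seconds},
  \qquad
  n = 90 \;\Longrightarrow\; t_{\text{delay}} \approx \frac{90}{18.2} \approx 4.95\ \text{seconds}
\]
```

Going the other way, a delay of t seconds needs roughly 18.2 times t calls, so 91 calls would land almost exactly on five seconds. Either figure is only as good as the calibration stored in TimedValue.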

Chapter Ten: Control Structures (Part 5)
27 SEP 1996