CONTROL STRUCTURES (Part 5)
10.9 Nested Statements
As long as you stick to the templates provided in the examples presented in this chapter, it is very easy to nest statements inside one another. The secret to making sure your assembly language sequences nest well is to ensure that each construct has one entry point and one exit point. If this is the case, then you will find it easy to combine statements. All of the statements discussed in this chapter follow this rule.
Perhaps the most commonly nested statements are the if..then..else statements. To see how easy it is to nest these statements in assembly language, consider the following Pascal code:
        if (x = y) then
            if (I >= J) then writeln('At point 1')
            else writeln('At point 2')
        else write('Error condition');
To convert this nested if..then..else to assembly language, start with the outermost if, convert it to assembly, then work on the innermost if:
; if (x = y) then

        mov     ax, X
        cmp     ax, Y
        jne     Else0

; Put innermost IF here

        jmp     IfDone0

; Else write('Error condition');

Else0:  print
        byte    "Error condition",0
IfDone0:
As you can see, the above code handles the "if (X=Y)..." instruction, leaving a spot for the second if. Now add in the second if as follows:
; if (x = y) then

        mov     ax, X
        cmp     ax, Y
        jne     Else0

; IF ( I >= J) then writeln('At point 1')

        mov     ax, I
        cmp     ax, J
        jnge    Else1
        print
        byte    "At point 1",cr,lf,0
        jmp     IfDone1

; Else writeln ('At point 2');

Else1:  print
        byte    "At point 2",cr,lf,0
IfDone1:
        jmp     IfDone0

; Else write('Error condition');

Else0:  print
        byte    "Error condition",0
IfDone0:
The nested if is the block that runs from the "; IF ( I >= J)" comment through the IfDone1 label.
There is an obvious optimization which you do not really want to make until speed becomes a real problem. Note in the innermost if statement above that the jmp IfDone1 instruction simply jumps to a jmp instruction which transfers control to IfDone0. It is very tempting to replace the first jmp by one which jumps directly to IfDone0. Indeed, when you go in and optimize your code, this would be a good optimization to make. However, you shouldn't make such optimizations to your code unless you really need the speed. Doing so makes your code harder to read and understand. Remember, we would like all our control structures to have one entry and one exit. Changing this jump as described would give the innermost if statement two exit points.
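The tempting shortcut described above might look like the following sketch of the inner if alone. It assumes the same word variables I and J and the same print macro as the code above; the only change is the target of the first jmp, which makes the IfDone1 label unnecessary. It is shown to make the trade-off concrete, not as a recommendation:

```asm
; IF ( I >= J) then writeln('At point 1')

        mov     ax, I
        cmp     ax, J
        jnge    Else1
        print
        byte    "At point 1",cr,lf,0
        jmp     IfDone0         ; was "jmp IfDone1"; skips the second jmp

; Else writeln ('At point 2');

Else1:  print
        byte    "At point 2",cr,lf,0
        jmp     IfDone0         ; unchanged: skip the outer else clause
```

The then branch now leaves the inner if from the middle of the block rather than through its end, which is the second exit point the text warns about.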
The for loop is another commonly nested control structure. Once again, the key to building up nested structures is to construct the outside object first and fill in the inner members afterwards. As an example, consider the following nested for loops, which add the elements of a pair of two-dimensional arrays together:
        for i := 0 to 7 do
            for j := 0 to 7 do
                A[i,j] := B[i,j] + C[i,j];
As before, begin by constructing the outermost loop first. This code assumes that dx will be the loop control variable for the outermost loop (that is, dx is equivalent to "i"):
; for dx := 0 to 7 do

        mov     dx, 0
ForLp0: cmp     dx, 7
        jnle    EndFor0

; Put innermost FOR loop here

        inc     dx
        jmp     ForLp0
EndFor0:
Now add the code for the nested for loop. Note the use of the cx register for the loop control variable on the innermost for loop of this code.
; for dx := 0 to 7 do

        mov     dx, 0
ForLp0: cmp     dx, 7
        jnle    EndFor0

; for cx := 0 to 7 do

        mov     cx, 0
ForLp1: cmp     cx, 7
        jnle    EndFor1

; Put code for A[dx,cx] := B[dx,cx] + C[dx,cx] here

        inc     cx
        jmp     ForLp1
EndFor1:
        inc     dx
        jmp     ForLp0
EndFor0:
Once again, the innermost for loop is the block that runs from the "; for cx := 0 to 7 do" comment through the EndFor1 label. The final step is to add the code which performs the actual computation.
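One way that final step might look is sketched below. The text does not say how A, B, and C are declared, so this fragment assumes each is an 8x8 array of words stored in row-major order and addressable through DS. It uses only 8086 instructions and destroys AX and BX, which the loops above do not use:

```asm
; A[dx,cx] := B[dx,cx] + C[dx,cx]
; Assumption (not from the text): A, B, and C are 8x8 arrays
; of words in row-major order.

        mov     bx, dx          ; bx = row index
        shl     bx, 1
        shl     bx, 1
        shl     bx, 1           ; bx = row * 8
        add     bx, cx          ; bx = row * 8 + column
        shl     bx, 1           ; bx = byte offset (2 bytes per word)
        mov     ax, B[bx]       ; fetch B[dx,cx]
        add     ax, C[bx]       ; add C[dx,cx]
        mov     A[bx], ax       ; store the sum in A[dx,cx]
```

This fragment would replace the "Put code for ..." comment inside the inner loop. With a different element size or array layout, the index computation would have to change accordingly.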
10.10 Timing Delay Loops
Most of the time the computer runs too slowly for most people's tastes. However, there are occasions when it actually runs too fast. One common solution is to create an empty loop to waste a small amount of time. In Pascal you will commonly see loops like:
for i := 1 to 10000 do ;
In assembly you might see a comparable loop:
        mov     cx, 8000h
DelayLp: loop   DelayLp
By carefully choosing the number of iterations, you can obtain a relatively accurate delay interval. There is, however, one catch. That relatively accurate delay interval is only going to be accurate on your machine. If you move your program to a different machine with a different CPU clock speed, number of wait states, different sized cache, or half a dozen other features, you will find that your delay loop takes a completely different amount of time. Since there is better than a hundred to one difference in speed between the high end and low end PCs today, it should come as no surprise that the loop above will execute 100 times faster on some machines than on others.
The fact that one CPU runs 100 times faster than another does not reduce the need to have a delay loop which executes some fixed amount of time. Indeed, it makes the problem that much more important. Fortunately, the PC provides a hardware-based timer which operates at the same speed regardless of the CPU speed. This timer maintains the time of day for the operating system, so it's very important that it run at the same speed whether you're on an 8088 or a Pentium. In the chapter on interrupts you will learn to actually patch into this device to perform various tasks. For now, we will simply take advantage of the fact that this timer chip forces the CPU to increment a 32-bit memory location (40:6ch) about 18.2 times per second. By looking at this variable we can determine the speed of the CPU and adjust the count value for an empty loop accordingly.
The basic idea of the following code is to watch the BIOS timer variable until it changes. Once it changes, start counting the number of iterations through some sort of loop until the BIOS timer variable changes again. Having noted the number of iterations, if you execute a similar loop the same number of times, it should require about 1/18.2 seconds to execute.
The following program demonstrates how to create such a Delay routine:
                .xlist
                include         stdlib.a
                includelib      stdlib.lib
                .list

; PPI_B is the I/O address of the keyboard/speaker control
; port. This program accesses it simply to introduce a
; large number of wait states on faster machines. Since the
; PPI (Programmable Peripheral Interface) chip runs at about
; the same speed on all PCs, accessing this chip slows most
; machines down to within a factor of two of the slower
; machines.

PPI_B           equ     61h

; RTC is the address of the BIOS timer variable (40:6ch).
; The BIOS timer interrupt code increments this 32-bit
; location about every 55 ms (1/18.2 seconds). The code
; which initializes everything for the Delay routine
; reads this location to determine when 1/18th seconds
; have passed.

RTC             textequ <es:[6ch]>

dseg            segment para public 'data'

; TimedValue contains the number of iterations the delay
; loop must repeat in order to waste 1/18.2 seconds.

TimedValue      word    0

; RTC2 is a dummy variable used by the Delay routine to
; simulate accessing a BIOS variable.

RTC2            word    0

dseg            ends

cseg            segment para public 'code'
                assume  cs:cseg, ds:dseg

; Main program which tests out the DELAY subroutine.

Main            proc
                mov     ax, dseg
                mov     ds, ax

                print
                byte    "Delay test routine",cr,lf,0

; Okay, let's see how long it takes to count down 1/18th
; of a second. First, point ES as segment 40h in memory.
; The BIOS variables are all in segment 40h.
;
; This code begins by reading the memory timer variable
; and waiting until it changes. Once it changes we can
; begin timing until the next change occurs. That will
; give us 1/18.2 seconds. We cannot start timing right
; away because we might be in the middle of a 1/18.2
; second period.

                mov     ax, 40h
                mov     es, ax
                mov     ax, RTC
RTCMustChange:  cmp     ax, RTC
                je      RTCMustChange

; Okay, begin timing the number of iterations it takes
; for an 18th of a second to pass. Note that this
; code must be very similar to the code in the Delay
; routine.

                mov     cx, 0
                mov     si, RTC
                mov     dx, PPI_B
TimeRTC:        mov     bx, 10
DelayLp:        in      al, dx
                dec     bx
                jne     DelayLp
                cmp     si, RTC
                loope   TimeRTC

                neg     cx              ;CX counted down!
                mov     TimedValue, cx  ;Save away

                mov     ax, ds
                mov     es, ax

                printf
                byte    "TimedValue = %d",cr,lf
                byte    "Press any key to continue",cr,lf
                byte    "This will begin a delay of five "
                byte    "seconds",cr,lf,0
                dword   TimedValue

                getc

                mov     cx, 90
DelayIt:        call    Delay18
                loop    DelayIt

Quit:           ExitPgm                 ;DOS macro to quit program.
Main            endp

; Delay18-This routine delays for approximately 1/18th sec.
;         Presumably, the variable "TimedValue" in DS has
;         been initialized with an appropriate count down
;         value before calling this code.

Delay18         proc    near
                push    ds
                push    es
                push    ax
                push    bx
                push    cx
                push    dx
                push    si

                mov     ax, dseg
                mov     es, ax
                mov     ds, ax

; The following code contains two loops. The inside
; nested loop repeats 10 times. The outside loop
; repeats the number of times determined to waste
; 1/18.2 seconds. This loop accesses the hardware
; port "PPI_B" in order to introduce many wait states
; on the faster processors. This helps even out the
; timings on very fast machines by slowing them down.
; Note that accessing PPI_B is only done to introduce
; these wait states; the data read is of no interest
; to this code.
;
; Note the similarity of this code to the code in the
; main program which initializes the TimedValue variable.

                mov     cx, TimedValue
                mov     si, es:RTC2
                mov     dx, PPI_B

TimeRTC:        mov     bx, 10
DelayLp:        in      al, dx
                dec     bx
                jne     DelayLp
                cmp     si, es:RTC2
                loope   TimeRTC

                pop     si
                pop     dx
                pop     cx
                pop     bx
                pop     ax
                pop     es
                pop     ds
                ret
Delay18         endp

cseg            ends

sseg            segment para stack 'stack'
stk             word    1024 dup (0)
sseg            ends
                end     Main
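The main program above hard-codes 90 calls to Delay18 for its five second delay. A small wrapper could do the same arithmetic for other intervals. The following is only a sketch: DelaySecs is a hypothetical name, not part of the original program or of the UCR Standard Library. It assumes that TimedValue has already been initialized, that the routine is placed inside cseg next to Delay18, and that AX holds a whole number of seconds no larger than 3600 (larger values would overflow the 16-bit quotient of the div instruction).

```asm
; DelaySecs (hypothetical helper, not in the original program)
; On entry: AX = number of seconds to wait (0..3600).
; Calls Delay18 about 18.2 times per requested second.

DelaySecs       proc    near
                push    ax
                push    cx
                push    dx

                mov     cx, 182
                mul     cx              ;DX:AX = seconds * 182
                mov     cx, 10
                div     cx              ;AX = seconds * 18.2, truncated
                mov     cx, ax          ;CX = number of Delay18 calls
                jcxz    DSDone          ;Zero seconds: nothing to do

DSLoop:         call    Delay18         ;Delay18 preserves CX
                loop    DSLoop

DSDone:         pop     dx
                pop     cx
                pop     ax
                ret
DelaySecs       endp
```

For example, loading AX with 5 and calling DelaySecs would produce 91 calls to Delay18 (5 * 182 / 10), one more than the 90 the main program uses. The scaling by 182/10 keeps the arithmetic in integers and avoids floating point.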
Chapter Ten: Control Structures (Part 5)
27 SEP 1996