Chapter Nine (Part 5) 
Table of Content  Chapter Nine (Part 7) 
CHAPTER
NINE: ARITHMETIC AND LOGICAL OPERATIONS (Part 6) 

9.5 
Machine and Arithmetic Idioms 9.5.1  Multiplying Without MUL and IMUL 9.5.2  Division Without DIV and IDIV 9.5.3  Using AND to Compute Remainders 
9.5.4 
Implementing Modulon Counters with AND 9.5.5  Testing an Extended Precision Value for 0FFFF..FFh 9.5.6  TEST Operations 9.5.7  Testing Signs with the XOR Instruction 
9.5 Machine and Arithmetic Idioms  
An idiom is an idiosyncrasy. Several arithmetic operations and 80x86 instructions have idiosyncracies that you can take advantage of when writing assembly language code. Some people refer to the use of machine and arithmetic idioms as "tricky programming" that you should always avoid in well written programs. While it is wise to avoid tricks just for the sake of tricks many machine and arithmetic idioms are wellknown and commonly found in assembly language programs. Some of them can be really tricky but a good number of them are simply "tricks of the trade." This text cannot even begin to present all of the idioms in common use today; they are too numerous and the list is constantly changing. Nevertheless there are some very important idioms that you will see all the time so it makes sense to discuss those.
9.5.1 Multiplying Without MUL and IMUL
If you take a quick look at the timing for the multiply
instruction
you'll notice that the execution time for this instruction is rather long.
Only the div
and idiv
instructions take longer on the 8086. When
multiplying by a constant
you can avoid the performance penalty of the mul
and
imul
instructions by using shifts
additions
and subtractions to perform the
multiplication.
Remember
a shl
operation performs the same
operation as multiplying the specified operand by two. Shifting to the left two bit
positions multiplies the operand by four. Shifting to the left three bit positions
multiplies the operand by eight. In general
shifting an operand to the left n bits
multiplies it by 2n. Any value can be multiplied by some constant using a series of shifts
and adds or shifts and subtractions. For example
to multiply the ax
register
by ten
you need only multiply it by eight and then add in two times the original value.
That is
10*ax = 8*ax + 2*ax
. The code to accomplish this is
shl ax 1 ;Multiply AX by two mov bx ax ;Save 2*AX for later shl ax 1 ;Multiply AX by four shl ax 1 ;Multiply AX by eight add ax bx ;Add in 2*AX to get 10*AX
The ax
register (or just about any register
for that matter) can be multiplied by most constant values much faster using shl
than
by using the mul
instruction. This may seem hard to believe since it only
takes two instructions to compute this product:
mov bx 10 mul bx
However
if you look at the timings
the shift and add
example above requires fewer clock cycles on most processors in the 80x86 family than the mul
instruction. Of course
the code is somewhat larger (by a few bytes)
but the
performance improvement is usually worth it. Of course
on the later 80x86 processors
the
mul
instruction is quite a bit faster than the earlier processors
but the
shift and add scheme is generally faster on these processors as well.
You can also use subtraction with shifts to perform a multiplication operation. Consider the following multiplication by seven:
mov bx ax ;Save AX*1 shl ax 1 ;AX := AX*2 shl ax 1 ;AX := AX*4 shl ax 1 ;AX := AX*8 sub ax bx ;AX := AX*7
This follows directly from the fact that ax*7 =
(ax*8)ax
.
A common error made by beginning assembly language students
is subtracting or adding one or two rather than ax*1
or ax*2
.
The following does not compute ax*7
:
shl ax 1 shl ax 1 shl ax 1 sub ax 1
It computes (8*ax)1 something entirely different (unless of course ax = 1). Beware of this pitfall when using shifts additions and subtractions to perform multiplication operations.
You can also use the lea instruction to compute certain products on 80386 and later processors. The trick is to use the 80386 scaled index mode. The following examples demonstrate some simple cases:
lea eax [ecx][ecx] ;EAX := ECX * 2 lea eax [eax]eax*2] ;EAX := EAX * 3 lea eax [eax*4] ;EAX := EAX * 4 lea eax [ebx][ebx*4] ;EAX := EBX * 5 lea eax [eax*8] ;EAX := EAX * 8 lea eax [edx][edx*8] ;EAX := EDX * 9
Much as the shl
instruction can be used for
simulating a multiplication by some power of two
the shr
and sar
instructions
can be used to simulate a division by a power of two. Unfortunately
you cannot use
shifts
additions
and subtractions to perform a division by an arbitrary constant as
easily as you can use these instructions to perform a multiplication operation.
Another way to perform division is to use the multiply instructions. You can divide by some value by multiplying by its reciprocal. The multiply instruction is marginally faster than the divide instruction; multiplying by a reciprocal is almost always faster than division.
Now you're probably wondering "how does one multiply
by a reciprocal when the values we're dealing with are all integers?" The answer
of
course
is that we must cheat to do this. If you want to multiply by one tenth
there is
no way you can load the value 1/10th into an 80x86 register prior to performing the
division. However
we could multiply 1/10th by 10
perform the multiplication
and then
divide the result by ten to get the final result. Of course
this wouldn't buy you
anything at all
in fact it would make things worse since you're now doing a
multiplication by ten as well as a division by ten. However
suppose you multiply 1/10th
by 65
536 (6553)
perform the multiplication
and then divide by 65
536. This would still
perform the correct operation and
as it turns out
if you set up the problem correctly
you can get the division operation for free. Consider the following code that divides ax
by ten:
mov dx 6554 ;Round (65 536/10) mul dx
This code leaves ax
/10 in the dx
register.
To understand how this works
consider what happens when
you multiply ax
by 65
536 (10000h). This simply moves ax
into dx
and sets ax
to zero. Multiplying by 6
554 (65
536 divided by ten) puts
ax
divided by ten into the dx
register. Since mul
is
marginally faster than div
this technique runs a little faster than using a
straight division.
Multiplying by the reciprocal works well when you need to divide by a constant. You could even use it to divide by a variable but the overhead to compute the reciprocal only pays off if you perform the division many many times (by the same value).
9.5.3 Using AND to Compute Remainders
The and
instruction can be used to quickly
compute remainders of the form:
dest := dest MOD 2n
To compute a remainder using the and
instruction
simply and the operand with the value 2n1. For example
to compute ax = ax mod 8
simply
use the instruction:
and ax 7
Additional examples:
and ax 3 ;AX := AX mod 4 and ax 0Fh ;AX := AX mod 16 and ax 1Fh ;AX := AX mod 32 and ax 3Fh ;AX := AX mod 64 and ax 7Fh ;AX := AX mod 128 mov ah 0 ;AX := AX mod 256 ; (Same as ax and 0FFh)
If you want to implement a counter variable that counts up to 2n1 and then resets to zero simply using the following code:
inc CounterVar and CounterVar nBits
where nBits is a binary value containing n one bits right justified in the number. For example to create a counter that cycles between zero and fifteen you could use the following:
inc CounterVar and CounterVar 00001111b
The and
instruction can be used to quickly
check a multiword value to see if it contains ones in all its bit positions. Simply load
the first word into the ax
register and then logically and the ax
register
with all the remaining words in the data structure. When the and operation is complete
the ax
register will contain 0FFFFh if and only if all the words in that
structure contained 0FFFFh. E.g.
mov ax word ptr var and ax word ptr var+2 and ax word ptr var+4 . . . and ax word ptr var+n cmp ax 0FFFFh je Is0FFFFh
Remember
the test
instruction is an and
instruction that doesn't retain the results of the and operation (other than the
flag settings). Therefore
many of the comments concerning the and operation (particularly
with respect to the way it affects the flags) also hold for the test
instruction.
However
since the test
instruction doesn't affect the destination operand
multiple bit tests may be performed on the same value. Consider the following code:
test ax 1 jnz Bit0 test ax 2 jnz Bit1 test ax 4 jnz Bit3 etc.
This code can be used to successively test each bit in the ax
register (or any other operand for that matter). Note that you cannot use the test/cmp
instruction pair to test for a specific value within a string of bits (as you can
with the and/cmp
instructions). Since test
doesn't strip out any
unwanted bits
the cmp
instruction would actually be comparing the original
value rather than the stripped value. For this reason
you'll normally use the test
instruction
to see if a single bit is set or if one or more bits out of a group of bits are set. Of
course
if you have an 80386 or later processor
you can also use the bt
instruction to test individual bits in an operand.
Another important use of the test
instruction
is to efficiently compare a register against zero. The following test instruction sets the
zero flag if and only if ax
contains zero (anything anded with itself
produces its original value; this sets the zero flag only if that value is zero):
test ax ax
The test instruction is shorter than
cmp ax 0 or
cmp eax 0
though it is no better than
cmp al 0
Note that you can use the and
and or
instructions
to test for zero in a fashion identical to test
. However
on pipelined
processors like the 80486 and Pentium chips
the test
instruction is less
likely to create a hazard since it does not store a result back into its destination
register.
9.5.7 Testing Signs with the XOR Instruction
Remember the pain associated with a multiprecision signed
multiplication operation? You need to determine the sign of the result
take the absolute
value of the operands
multiply them
and then adjust the sign of the result as determined
before the multiplication operation. The sign of the product of two numbers is simply the
exclusiveor of their signs before performing the multiplication. Therefore
you can use
the xor
instruction to determine the sign of the product of two extended
precision numbers. E.g.
32x32 Multiply: mov al byte ptr Oprnd1+3 xor al byte ptr Oprnd2+3 mov cl al ;Save sign ; Do the multiplication here (don't forget to take the absolute ; value of the two operands before performing the multiply). . . . ; Now fix the sign. cmp cl 0 ;Check sign bit jns ResultIsPos ; Negate the product here. . . . ResultIsPos:
Chapter Nine (Part 5) 
Table of Content  Chapter Nine (Part 7) 
Chapter Nine: Arithmetic And Logical
Operations (Part 6)
27 SEP 1996