DATA REPRESENTATION (Part 2)
1.3 - The Hexadecimal Numbering System
1.4 - Arithmetic Operations on Binary and Hexadecimal Numbers
1.5 - Logical Operations on Bits
1.6 - Logical Operations on Binary Numbers and Bit Strings
1.7 - Signed and Unsigned Numbers
1.8 - Sign and Zero Extension
A big problem with the binary system is verbosity. To represent the value 202 (decimal) requires eight binary digits; the decimal version requires only three decimal digits and thus represents numbers much more compactly than does the binary numbering system. This fact was not lost on the engineers who designed binary computer systems. When dealing with large values, binary numbers quickly become too unwieldy. Unfortunately, the computer thinks in binary, so most of the time it is convenient to use the binary numbering system. Although we can convert between decimal and binary, the conversion is not a trivial task. The hexadecimal (base 16) numbering system solves these problems. Hexadecimal numbers offer the two features we're looking for: they're very compact, and it's simple to convert them to binary and vice versa. Because of this, most binary computer systems today use the hexadecimal numbering system. Since the radix (base) of a hexadecimal number is 16, each hexadecimal digit to the left of the hexadecimal point represents some value times a successive power of 16. For example, the number 1234 (hexadecimal) is equal to:
1 * 16**3 + 2 * 16**2 + 3 * 16**1 + 4 * 16**0
4096 + 512 + 48 + 4 = 4660 (decimal).
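The expansion above can be sketched in a few lines of Python (not part of the original text; Python's built-in int(s, 16) performs the same conversion in one call):

```python
# Evaluate a hexadecimal string digit by digit, multiplying by
# successive powers of 16 as in the expansion above.
def hex_to_decimal(s):
    """Evaluate a hexadecimal string as digits times powers of 16."""
    value = 0
    for digit in s.upper():
        value = value * 16 + "0123456789ABCDEF".index(digit)
    return value

print(hex_to_decimal("1234"))   # 4660
```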
Each hexadecimal digit can represent one of sixteen values between 0 and 15. Since there are only ten decimal digits, we need to invent six additional digits to represent the values in the range 10 through 15. Rather than create new symbols for these digits, we'll use the letters A through F. The following are all examples of valid hexadecimal numbers:
1234 DEAD BEEF 0AFB FEED DEAF
Since we'll often need to enter hexadecimal numbers into the computer system, we'll need a different mechanism for representing hexadecimal numbers. After all, on most computer systems you cannot enter a subscript to denote the radix of the associated value. We'll adopt the following conventions: every hexadecimal value ends with the letter "h", and any value that would otherwise begin with one of the letters A through F gets a leading zero so that it always starts with a decimal digit.
Examples of valid hexadecimal numbers:
1234h 0DEADh 0BEEFh 0AFBh 0FEEDh 0DEAFh
As you can see, hexadecimal numbers are compact and easy to read. In addition, you can easily convert between hexadecimal and binary. Consider the following table:

Binary/Hex Conversion:

    Binary    Hexadecimal
    0000      0
    0001      1
    0010      2
    0011      3
    0100      4
    0101      5
    0110      6
    0111      7
    1000      8
    1001      9
    1010      A
    1011      B
    1100      C
    1101      D
    1110      E
    1111      F

This table provides all the information you'll ever need to convert any hexadecimal number into a binary number or vice versa.
To convert a hexadecimal number into a binary number, simply substitute the corresponding four bits for each hexadecimal digit in the number. For example, to convert 0ABCDh into a binary value, simply convert each hexadecimal digit according to the table above:
0 A B C D Hexadecimal
0000 1010 1011 1100 1101 Binary
Converting a binary number into hexadecimal format is almost as easy. The first step is to pad the binary number with zeros to make sure its length is a multiple of four bits. For example, given the binary number 1011001010, the first step would be to add two bits to the left of the number so that it contains 12 bits: 001011001010. The next step is to separate the binary value into groups of four bits, e.g., 0010 1100 1010. Finally, look up these binary values in the table above and substitute the appropriate hexadecimal digits, e.g., 2CA. Contrast this with the difficulty of conversion between decimal and binary or decimal and hexadecimal!
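The table-lookup procedure in both directions can be sketched in Python (an illustration, not part of the original text; the dictionary plays the role of the conversion table):

```python
# Build the hex-digit <-> 4-bit-group table, then convert by substitution.
HEX_TO_BITS = {d: format(i, "04b") for i, d in enumerate("0123456789ABCDEF")}
BITS_TO_HEX = {bits: d for d, bits in HEX_TO_BITS.items()}

def hex_to_bin(h):
    """Substitute four bits for each hexadecimal digit."""
    return " ".join(HEX_TO_BITS[d] for d in h.upper())

def bin_to_hex(b):
    """Pad to a multiple of four bits, then substitute a hex digit per group."""
    b = b.replace(" ", "")
    b = b.zfill((len(b) + 3) // 4 * 4)          # pad on the left with zeros
    return "".join(BITS_TO_HEX[b[i:i+4]] for i in range(0, len(b), 4))

print(hex_to_bin("0ABCD"))       # 0000 1010 1011 1100 1101
print(bin_to_hex("1011001010"))  # 2CA
```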
Since converting between hexadecimal and binary is an operation you will need to perform over and over again, you should take a few minutes and memorize the table above. Even if you have a calculator that will do the conversion for you, you'll find manual conversion to be a lot faster and more convenient when converting between binary and hex.
There are several operations we can perform on binary and hexadecimal numbers. For example, we can add, subtract, multiply, divide, and perform other arithmetic operations. Although you needn't become an expert at it, you should be able, in a pinch, to perform these operations manually using a piece of paper and a pencil. Having just said that you should be able to perform these operations manually, the correct way to perform such arithmetic operations is to have a calculator which does them for you. There are several such calculators on the market; the following table lists some of the manufacturers who produce such devices:
Manufacturers of Hexadecimal Calculators:
This list is by no means exhaustive. Other calculator manufacturers probably produce these devices as well. The Hewlett-Packard devices are arguably the best of the bunch; however, they are more expensive than the others. Sharp and Casio produce units which sell for well under $50. If you plan on doing any assembly language programming at all, owning one of these calculators is essential.
Another alternative to purchasing a hexadecimal calculator is to obtain a TSR (Terminate and Stay Resident) program, such as SideKick, which contains a built-in calculator. However, unless you already have one of these programs, or you need some of the other features they offer, such programs are not a particularly good value since they cost more than an actual calculator and are not as convenient to use.
To understand why you should spend the money on a calculator consider the following arithmetic problem:
      9h
    + 1h
    ----
You're probably tempted to write in the answer "10h" as the solution to this problem. But that is not correct! The correct answer is ten, which is "0Ah", not sixteen, which is "10h". A similar problem exists with the arithmetic problem:
     10h
    - 1h
    ----
You're probably tempted to answer "9h", even though the true answer is "0Fh". Remember, this problem is asking "what is the difference between sixteen and one?" The answer, of course, is fifteen, which is "0Fh".
Even if the two problems above don't bother you, in a stressful situation your brain will switch back into decimal mode while you're thinking about something else, and you'll produce the incorrect result. Moral of the story: if you must do an arithmetic computation using hexadecimal numbers by hand, take your time and be careful about it. Either that, or convert the numbers to decimal, perform the operation in decimal, and convert them back to hexadecimal.
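The two traps above are easy to check mechanically; for instance, in Python (an aside, not part of the original text), literals prefixed with 0x are hexadecimal:

```python
# 9h + 1h is ten, written 0Ah -- not 10h (sixteen).
print(format(0x9 + 0x1, "X"))    # A
# 10h - 1h is fifteen, written 0Fh -- not 9h.
print(format(0x10 - 0x1, "X"))   # F
```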
You should never perform binary arithmetic computations by hand. Since binary numbers usually contain long strings of bits, there is too much of an opportunity for you to make a mistake. Always convert binary numbers to hex, perform the operation in hex (preferably with a hex calculator), and convert the result back to binary, if necessary.
There are four main logical operations we'll need to perform on hexadecimal and binary numbers: AND, OR, XOR (exclusive-or), and NOT. Unlike the arithmetic operations, a hexadecimal calculator isn't necessary to perform these operations. It is often easier to do them by hand than to use an electronic device to compute them. The logical AND operation is a dyadic operation (meaning it accepts exactly two operands). These operands are single binary (base 2) bits. The AND operation is:
0 and 0 = 0
0 and 1 = 0
1 and 0 = 0
1 and 1 = 1
A compact way to represent the logical AND operation is with a truth table. A truth table takes the following form:

    AND   0   1
     0    0   0
     1    0   1
This is just like the multiplication tables you encountered in elementary school. The column on the left and the row at the top represent input values to the AND operation. The value located at the intersection of the row and column (for a particular pair of input values) is the result of logically ANDing those two values together. In English, the logical AND operation is, "If the first operand is one and the second operand is one, the result is one; otherwise the result is zero."
One important fact to note about the logical AND operation is that you can use it to force a zero result. If one of the operands is zero, the result is always zero, regardless of the other operand. In the truth table above, for example, the row labelled with a zero input contains only zeros, and the column labelled with a zero contains only zero results. Conversely, if one operand contains a one, the result is exactly the value of the second operand. These features of the AND operation are very important, particularly when working with bit strings and we want to force individual bits in the string to zero. We will investigate these uses of the logical AND operation in the next section.
The logical OR operation is also a dyadic operation. Its definition is:
0 or 0 = 0
0 or 1 = 1
1 or 0 = 1
1 or 1 = 1
The truth table for the OR operation takes the following form:

    OR    0   1
     0    0   1
     1    1   1
Colloquially, the logical OR operation is, "If the first operand or the second operand (or both) is one, the result is one; otherwise the result is zero." This is also known as the inclusive-OR operation.
If one of the operands to the logical-OR operation is a one, the result is always one, regardless of the second operand's value. If one operand is zero, the result is always the value of the second operand. Like the logical AND operation, this is an important side-effect of the logical-OR operation that will prove quite useful when working with bit strings (see the next section).
Note that there is a difference between this form of the inclusive logical OR operation and the standard English meaning. Consider the phrase "I am going to the store or I am going to the park." Such a statement implies that the speaker is going to the store or to the park, but not to both places. Therefore, the English version of logical OR is slightly different than the inclusive-OR operation; indeed, it is closer to the exclusive-OR operation.
The logical XOR (exclusive-or) operation is also a dyadic operation. It is defined as follows:
0 xor 0 = 0
0 xor 1 = 1
1 xor 0 = 1
1 xor 1 = 0
The truth table for the XOR operation takes the following form:

    XOR   0   1
     0    0   1
     1    1   0
In English, the logical XOR operation is, "If the first operand or the second operand, but not both, is one, the result is one; otherwise the result is zero." Note that the exclusive-or operation is closer to the English meaning of the word "or" than is the logical OR operation.
If one of the operands to the logical exclusive-OR operation is a one, the result is always the inverse of the other operand: a one operand produces zero when the other operand is one, and one when the other operand is zero. If the first operand contains a zero, then the result is exactly the value of the second operand. This feature lets you selectively invert bits in a bit string.
The logical NOT operation is a monadic operation (meaning it accepts only one operand). It is:
NOT 0 = 1
NOT 1 = 0
The truth table for the NOT operation takes the following form:

    NOT   0   1
          1   0
As described in the previous section, the logical functions work only with single bit operands. Since the 80x86 uses groups of eight, sixteen, or thirty-two bits, we need to extend the definition of these functions to deal with more than two bits. Logical functions on the 80x86 operate on a bit-by-bit (or bitwise) basis. Given two values, these functions operate on bit zero, producing bit zero of the result. They operate on bit one of the input values, producing bit one of the result, etc. For example, if you want to compute the logical AND of the following two eight-bit numbers, you would perform the logical AND operation on each column independently of the others:
    1011 0101
    1110 1110
    ---------
    1010 0100
This bit-by-bit form of execution can be easily applied to the other logical operations as well.
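The column-by-column computation above is exactly what the bitwise operators in a high-level language do; a quick Python sketch (an aside, not part of the original text):

```python
# Python's &, |, and ^ apply AND, OR, and XOR to each bit position
# of their operands independently, just like the 80x86 instructions.
a = 0b10110101
b = 0b11101110
print(format(a & b, "08b"))   # 10100100  (the example above)
print(format(a | b, "08b"))   # 11111111
print(format(a ^ b, "08b"))   # 01011011
```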
Since we've defined logical operations in terms of binary values, you'll find it much easier to perform logical operations on binary values than on values in other bases. Therefore, if you want to perform a logical operation on two hexadecimal numbers, you should convert them to binary first. This applies to most of the basic logical operations on binary numbers (e.g., AND, OR, XOR, etc.).
The ability to force bits to zero or one using the logical AND/OR operations, and the ability to invert bits using the logical XOR operation, is very important when working with strings of bits (e.g., binary numbers). These operations let you selectively manipulate certain bits within some value while leaving other bits unaffected. For example, if you have an eight-bit binary value 'X' and you want to guarantee that bits four through seven contain zeros, you could logically AND the value 'X' with the binary value 0000 1111. This bitwise logical AND operation would force the H.O. four bits to zero and pass the L.O. four bits of 'X' through unchanged. Likewise, you could force the L.O. bit of 'X' to one and invert bit number two of 'X' by logically ORing 'X' with 0000 0001 and logically exclusive-ORing 'X' with 0000 0100, respectively. Using the logical AND, OR, and XOR operations to manipulate bit strings in this fashion is known as masking bit strings. We use the term masking because we can use certain values (one for AND, zero for OR/XOR) to 'mask out' certain bits from the operation when forcing bits to zero, one, or their inverse.
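The three masking operations just described can be demonstrated in Python (an illustration with an arbitrary value chosen here for 'X'; not part of the original text):

```python
x = 0b10111010                        # an arbitrary eight-bit value 'X'
print(format(x & 0b00001111, "08b"))  # 00001010  H.O. four bits forced to zero
print(format(x | 0b00000001, "08b"))  # 10111011  L.O. bit forced to one
print(format(x ^ 0b00000100, "08b"))  # 10111110  bit number two inverted
```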
So far, we've treated binary numbers as unsigned values. The binary number ...00000 represents zero, ...00001 represents one, ...00010 represents two, and so on toward infinity. What about negative numbers? Signed values have been tossed around in previous sections, and we've mentioned the two's complement numbering system, but we haven't discussed how to represent negative numbers using the binary numbering system. That is what this section is all about!
To represent signed numbers using the binary numbering system, we have to place a restriction on our numbers: they must have a finite and fixed number of bits. As far as the 80x86 goes, this isn't too much of a restriction; after all, the 80x86 can only address a finite number of bits. For our purposes, we're going to severely limit the number of bits to eight, sixteen, thirty-two, or some other small number of bits.
With a fixed number of bits, we can only represent a certain number of objects. For example, with eight bits we can only represent 256 different objects. Negative values are objects in their own right, just like positive numbers. Therefore, we'll have to use some of the 256 different values to represent negative numbers. In other words, we've got to use up some of the positive numbers to represent negative numbers. To make things fair, we'll assign half of the possible combinations to the negative values and half to the positive values. So we can represent the negative values -128..-1 and the positive values 0..127 with a single eight-bit byte. With a 16-bit word, we can represent values in the range -32,768..+32,767. With a 32-bit double word, we can represent values in the range -2,147,483,648..+2,147,483,647. In general, with n bits we can represent the signed values in the range -2**(n-1) to +2**(n-1)-1.
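The general range formula can be checked for the three sizes mentioned (a small Python aside, not part of the original text):

```python
# n bits of two's complement represent -2**(n-1) .. +2**(n-1)-1.
def signed_range(n):
    """Smallest and largest values of an n-bit two's complement integer."""
    return -(2 ** (n - 1)), 2 ** (n - 1) - 1

for n in (8, 16, 32):
    print(n, signed_range(n))
```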
Okay, so we can represent negative values. Exactly how do we do it? Well, there are many ways, but the 80x86 microprocessor uses the two's complement notation. In the two's complement system, the H.O. bit of a number is a sign bit. If the H.O. bit is zero, the number is positive; if the H.O. bit is one, the number is negative. Examples:
For 16-bit numbers:
8000h is negative because the H.O. bit is one.
100h is positive because the H.O. bit is zero.
7FFFh is positive.
0FFFFh is negative.
0FFFh is positive.
If the H.O. bit is zero, then the number is positive and is stored as a standard binary value. If the H.O. bit is one, then the number is negative and is stored in the two's complement form. To convert a positive number to its negative, two's complement form, you use the following algorithm:
1) Invert all the bits in the number i.e. apply the logical NOT function.
2) Add one to the inverted result.
For example to compute the eight bit equivalent of -5:
    0000 0101   Five (in binary).
    1111 1010   Invert all the bits.
    1111 1011   Add one to obtain result.
If we take minus five and perform the two's complement operation on it, we get our original value, 0000 0101, back again, just as we expect:
    1111 1011   Two's complement for -5.
    0000 0100   Invert all the bits.
    0000 0101   Add one to obtain result (+5).
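The invert-and-add-one algorithm is easy to express in code; here is a Python sketch (not part of the original text) that masks to eight bits so the result wraps the way an eight-bit register would:

```python
def negate8(x):
    """Two's complement negation of an eight-bit value: invert, add one."""
    return (~x + 1) & 0xFF        # the mask keeps the result in eight bits

print(format(negate8(0b00000101), "08b"))  # 11111011  (-5)
print(format(negate8(0b11111011), "08b"))  # 00000101  (+5 again)
```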
The following examples provide some positive and negative 16-bit signed values:
    7FFFh:  +32,767, the largest 16-bit positive number.
    8000h:  -32,768, the smallest 16-bit negative number.
    4000h:  +16,384.
To convert the numbers above to their negative counterpart (i.e. to negate them) do the following:
    7FFFh:  0111 1111 1111 1111   +32,767
            1000 0000 0000 0000   Invert all the bits (8000h)
            1000 0000 0000 0001   Add one (8001h or -32,767)

    8000h:  1000 0000 0000 0000   -32,768
            0111 1111 1111 1111   Invert all the bits (7FFFh)
            1000 0000 0000 0000   Add one (8000h or -32,768)

    4000h:  0100 0000 0000 0000   +16,384
            1011 1111 1111 1111   Invert all the bits (0BFFFh)
            1100 0000 0000 0000   Add one (0C000h or -16,384)
8000h inverted becomes 7FFFh. After adding one, we obtain 8000h! Wait, what's going on here? -(-32,768) is -32,768? Of course not. But the value +32,768 cannot be represented with a 16-bit signed number, so we cannot negate the smallest negative value. If you attempt this operation, the 80x86 microprocessor will complain about signed arithmetic overflow.
Why bother with such a miserable numbering system? Why not use the H.O. bit as a sign flag, storing the positive equivalent of the number in the remaining bits? The answer lies in the hardware. As it turns out, negating values is the only tedious job. With the two's complement system, most other operations are as easy as in the binary system. For example, suppose you were to perform the addition 5+(-5). The result is zero. Consider what happens when we add these two values in the two's complement system:
      0000 0101
      1111 1011
      ---------
    1 0000 0000
We end up with a carry into the ninth bit, and all other bits are zero. As it turns out, if we ignore the carry out of the H.O. bit, adding two signed values always produces the correct result when using the two's complement numbering system. This means we can use the same hardware for signed and unsigned addition and subtraction. This wouldn't be the case with some other numbering systems.
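The carry-discarding behavior is easy to reproduce in software by masking the raw sum to eight bits (a Python aside, not part of the original text):

```python
# 5 + (-5) in eight-bit two's complement: the raw sum carries into a
# ninth bit; masking to eight bits discards the carry, leaving zero.
five = 0b00000101
minus_five = 0b11111011
raw = five + minus_five
print(format(raw, "b"))           # 100000000  (carry into bit 8)
print(format(raw & 0xFF, "08b"))  # 00000000   (the correct signed result)
```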
Except for the questions at the end of this chapter, you will not need to perform the two's complement operation by hand. The 80x86 microprocessor provides an instruction, NEG (negate), which performs this operation for you. Furthermore, all the hexadecimal calculators will perform this operation by pressing the change sign key (+/- or CHS). Nevertheless, performing a two's complement by hand is easy, and you should know how to do it.
Once again, you should note that the data represented by a set of binary bits depends entirely on the context. The eight-bit binary value 11000000b could represent an IBM/ASCII character, it could represent the unsigned decimal value 192, or it could represent the signed decimal value -64, etc. As the programmer, it is your responsibility to use this data consistently.
Since two's complement format integers have a fixed length, a small problem develops. What happens if you need to convert an eight-bit two's complement value to 16 bits? This problem, and its converse (converting a 16-bit value to eight bits), can be accomplished via sign extension and contraction operations. Likewise, the 80x86 works with fixed length values, even when processing unsigned binary numbers. Zero extension lets you convert small unsigned values to larger unsigned values.
Consider the value "-64". The eight-bit two's complement value for this number is 0C0h. The 16-bit equivalent of this number is 0FFC0h. Now consider the value "+64". The eight and 16-bit versions of this value are 40h and 0040h, respectively. The difference between the eight and 16-bit numbers can be described by the rule: "If the number is negative, the H.O. byte of the 16-bit number contains 0FFh; if the number is positive, the H.O. byte of the 16-bit quantity is zero."
To sign extend a value from some number of bits to a greater number of bits is easy: just copy the sign bit into all the additional bits in the new format. For example, to sign extend an eight-bit number to a 16-bit number, simply copy bit seven of the eight-bit number into bits 8..15 of the 16-bit number. To sign extend a 16-bit number to a double word, simply copy bit 15 into bits 16..31 of the double word.
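The copy-the-sign-bit rule can be sketched in Python (an illustration, not part of the original text; the function and parameter names are my own):

```python
def sign_extend(x, from_bits, to_bits):
    """Widen the unsigned encoding of a signed value, preserving its sign."""
    if x & (1 << (from_bits - 1)):                           # sign bit set?
        x |= ((1 << to_bits) - 1) ^ ((1 << from_bits) - 1)   # fill new H.O. bits with ones
    return x

print(format(sign_extend(0x80, 8, 16), "04X"))  # FF80
print(format(sign_extend(0x28, 8, 16), "04X"))  # 0028
print(format(sign_extend(0x9A, 8, 32), "08X"))  # FFFFFF9A
```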
Sign extension is required when manipulating signed values of varying lengths. Often, you'll need to add a byte quantity to a word quantity. You must sign extend the byte quantity to a word before the operation takes place. Other operations (multiplication and division, in particular) may require a sign extension to 32 bits. You must not sign extend unsigned values.
Examples of sign extension:
    Eight Bits    Sixteen Bits    Thirty-two Bits
    80h           FF80h           FFFFFF80h
    28h           0028h           00000028h
    9Ah           FF9Ah           FFFFFF9Ah
    7Fh           007Fh           0000007Fh
    ---           1020h           00001020h
    ---           8088h           FFFF8088h
To extend an unsigned byte, you must zero extend the value. Zero extension is very easy: just store a zero into the H.O. byte(s) of the larger operand. For example, to zero extend the value 82h to 16 bits, you simply add a zero to the H.O. byte, yielding 0082h.
    Eight Bits    Sixteen Bits    Thirty-two Bits
    80h           0080h           00000080h
    28h           0028h           00000028h
    9Ah           009Ah           0000009Ah
    7Fh           007Fh           0000007Fh
    ---           1020h           00001020h
    ---           8088h           00008088h
Sign contraction, converting a value with some number of bits to the identical value with a fewer number of bits, is a little more troublesome. Sign extension never fails. Given an m-bit signed value, you can always convert it to an n-bit number (where n > m) using sign extension. Unfortunately, given an n-bit number, you cannot always convert it to an m-bit number if m < n. For example, consider the value -448. As a 16-bit hexadecimal number, its representation is 0FE40h. Unfortunately, the magnitude of this number is too great to fit into an eight-bit value, so you cannot sign contract it to eight bits. This is an example of an overflow condition that occurs upon conversion.
To properly sign contract one value to another, you must look at the H.O. byte(s) that you want to discard. The H.O. bytes you wish to remove must all contain either zero or 0FFh. If you encounter any other values, you cannot contract the value without overflow. Finally, the H.O. bit of your resulting value must match every bit you've removed from the number. Examples (16 bits to eight bits):
    FF80h can be sign contracted to 80h.
    0040h can be sign contracted to 40h.
    FE40h cannot be sign contracted to 8 bits.
    0100h cannot be sign contracted to 8 bits.
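The contraction rule, including the overflow check on the discarded byte, can be sketched as follows (a Python illustration, not part of the original text; the function name is my own):

```python
def sign_contract16to8(x):
    """Contract a 16-bit signed encoding to 8 bits, or raise on overflow."""
    low = x & 0xFF
    # The discarded H.O. byte must be all zeros or all ones, and it must
    # match the sign bit of the 8-bit result.
    expected_high = 0xFF if (low & 0x80) else 0x00
    if (x >> 8) != expected_high:
        raise OverflowError("value does not fit in eight bits")
    return low

print(format(sign_contract16to8(0xFF80), "02X"))  # 80
print(format(sign_contract16to8(0x0040), "02X"))  # 40
```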
Chapter One: Data Representation
26 SEP 1996