Saturday 20 October 2012

1.3 Binary Number Operation

Binary numbers let you represent any amount you want using only 2 digits: 0s and 1s.


Decimal
Binary
1
1
2
10
3
11
4
100
5
101
6
110
7
111
8
1000
9
1001
10
1010

Table 1: Examples of decimal and binary numbers

How do we count using binary?
Since we were small, we’re taught to count in ascending order 1, 2, 3, 4… But it’s slightly different using binary.

Addition
Decimal number
Both binary and decimal addition steps are similar, let’s recall again by this simply decimal addition example, which is the addition of 5681 and 3253.

Figure 1: Decimal addition

Basically, it’s starting from adding the least significant digits on the right and write down the result. The last digit of the answer will be written under the column and the remaining numbers will bring forward to next pair of digits to the left, and work the way up to the most significant digits.

Binary number
The addition steps shown above works similarly in binary addition. Decimal system uses power of 10 and the binary system uses power of 2. Which means, in decimal addition, if the result for a column addition is 10, “1” will bring forward to the next column and “0” remained as the answer.
While in binary addition,



Decimal
Binary
0+1=
1
1
1+1=
2
10
1+1+1=
3
11
1+1+1+1=
4
100
1+1+1+1+1=
5
101


For example,
Addition of 01010101 (decimal=85) and 01110100 (decimal=116):


Figure 2: Binary Addition


Subtraction
Binary subtraction works with the same concept in binary addition, but it’s the inverse version of addition. This means, it can subtract the numbers like normal decimal subtraction, and borrow “1” from next value if it’s not enough to subtract the value in same column.

Figure 3: Difference between binary and decimal numbers subtraction

In binary subtraction,



Decimal
Binary
1-0=
1
1
0-0=
0
0
1-1=
0
0
0-1=
-1
1 (*)
                                                                                        *Borrow “1” from the next value

Subtracting a "1" digit from a "0" digit produces the digit "1", while 1 will have to be subtracted from the next column. This is known as borrowing. The principle is the same as for carrying. When the result of a subtraction is less than 0, the least possible value of a digit, the procedure is to "borrow" the deficit divided by the radix (that is, 10/10) from the left, subtracting it from the next positional value.

Example for binary subtraction,
Subtraction of 100001 (dec=33) & 11111 (dec=31) ; 11100 (dec=28) & 1111 (dec=15):



Multiplication
The product of two N-bit numbers requires 2N bits to contain all possible values. If the precision of the two two's-complement operands is doubled before the multiplication, direct multiplication (discarding any excess bits beyond that precision) will provide the correct result.

In binary multiplication,


Decimal
Binary
0x0=
0
0
0x1=
0
0
1x0=
0
0
1x1=
1
1

It’s pretty much the same with decimal number multiplication.

For example, take 6 × −5 = −30. First, the precision is extended from 4 bits to 8. Then the numbers are multiplied, discarding the bits beyond 8 (shown by 'x'):


         00000110  (6)
       *   11111011  (−5)
 ============
          110
         1100
        00000
       110000
      1100000
     11000000
    x10000000
   xx00000000
 ============
   xx11100010


This is very inefficient; by doubling the precision ahead of time, all additions must be double-precision and at least twice as many partial products are needed than for the more efficient algorithms actually implemented in computers. Some multiplication algorithms are designed for two's complement, notably Booth's multiplication algorithm. Methods for multiplying sign-magnitude numbers don't work with two's-complement numbers without adaptation. There isn't usually a problem when the multiplicand (the one being repeatedly added to form the product) is negative; the issue is setting the initial bits of the product correctly when the multiplier is negative. Two methods for adapting algorithms to handle two's-complement numbers are common:
  • First check to see if the multiplier is negative. If so, negate (i.e., take the two's complement of) both operands before multiplying. The multiplier will then be positive so the algorithm will work. Because both operands are negated, the result will still have the correct sign.
  • Subtract the partial product resulting from the MSB (pseudo sign bit) instead of adding it like the other partial products. This method requires the multiplicand's sign bit to be extended by one position, being preserved during the shift right actions.

As an example of the second method, take the common add-and-shift algorithm for multiplication. Instead of shifting partial products to the left as is done with pencil and paper, the accumulated product is shifted right, into a second register that will eventually hold the least significant half of the product. Since the least significant bits are not changed once they are calculated, the additions can be single precision, accumulating in the register that will eventually hold the most significant half of the product. In the following example, again multiplying 6 by −5, the two registers and the extended sign bit are separated by "|":


0 0110  (6)  (multiplicand with extended sign bit)
 × 1011 (−5)  (multiplier)
 =|====|====
 0|0110|0000  (first partial product (rightmost bit is 1))
 0|0011|0000  (shift right, preserving extended sign bit)
 0|1001|0000  (add second partial product (next bit is 1))
 0|0100|1000  (shift right, preserving extended sign bit)
 0|0100|1000  (add third partial product: 0 so no change)
 0|0010|0100  (shift right, preserving extended sign bit)
 1|1100|0100  (subtract last partial product since it's from sign bit)
 1|1110|0010  (shift right, preserving extended sign bit)
  |1110|0010  (discard extended sign bit, giving the final answer, -30)



Division
Binary division is again similar to its decimal counterpart:
Here, the divisor is 1002, or 4 decimal, while the dividend is 111012, or 29 decimal. The procedure is the same as that of decimal long division; here, the divisor 1002 goes into the first three digits 1112 of the dividend one time, so a "1" is written on the top line. This result is multiplied by the divisor, and subtracted from the first three digits of the dividend; the next digit (a "0") is included to obtain a new three-digit sequence:


         1
        ___________
1 0 0   ) 1 1 1 0 1 
   − 1 0 0
      -----
     1 1 0


The procedure is then repeated with the new sequence, continuing until the digits in the dividend have been exhausted:


             1 1 1
       ___________
1 0 0  ) 1 1 1 0 1
   − 1 0 0
     -----
         1 1 0
       − 1 0 0
         -----
             1 0 1
           − 1 0 0
             -----
                 1


Thus, the quotient of 111012 divided by 1002 is 1112, as shown on the top line, while the remainder, shown on the bottom line, is 12. In decimal, 29 divided by 4 is 7, with a remainder of 1.


Overflow
One caveat with signed binary numbers is that of overflow, where the answer to an addition or subtraction problem exceeds the magnitude which can be represented with the allotted number of bits. With five bits to represent magnitude, we have a representation range of 25 or thirty-two integer steps from 0 to maximum. This means that we can represent a number as high as +3110 (0111112), or as low as -3210 (1000002).

For example, if we set up an addition problem with two binary numbers, the sixth bit used for sign, and the result either exceeds +3110 or is less than -3210 , the answer will be incorrect. Let's try adding 1710 and 1910 to see how this overflow condition works for excessive positive numbers:


   1710 = 100012               1910 = 100112

                         1  11  <--- Carry bits
(Showing sign bits)      010001
                       + 010011
                       --------
                         100100


The answer (1001002), interpreted with the sixth bit as the -3210 place, is actually equal to -2810, not +3610 as we should get with +1710 and +1910 added together! Obviously, this is not correct. What went wrong? The answer lies in the restrictions of the six-bit number field within which we're working since the magnitude of the true and proper sum (3610) exceeds the allowable limit for our designated bit field, we have an overflow error. Simply put, six places doesn't give enough bits to represent the correct sum, so whatever figure we obtain using the strategy of discarding the left-most "carry" bit will be incorrect.

A similar error will occur if we add two negative numbers together to produce a sum that is too low for our six-bit binary field. Let's try adding -1710 and -1910 together to check whether this works:


  -1710 = 100012              -1910 = 100112

                        1 1111  <--- Carry bits
(Showing sign bits)      101111
                       + 101101
                       --------
                        1011100
                        |
               Discard extra bit


The (incorrect) answer is a +28. The fact that the real sum of negative seventeen and negative nineteen was too low to be properly represented with a five bit magnitude field and a sixth sign bit is the root cause of this difficulty.

Let's try these two problems again, except this time using the seventh bit for a sign bit, and allowing the use of 6 bits for representing the magnitude:


     1  11                   11 1111      
  0010001                1101111
+ 0010011              + 1101101    
---------              ---------
  01001002              110111002
                          |
                 Discard extra bit


By using bit fields sufficiently large to handle the magnitude of the sums, we arrive at the correct answers.

In these sample problems we've been able to detect overflow errors by performing the addition problems in decimal form and comparing the results with the binary answers. For example, when adding +1710 and +1910 together, we knew that the answer was supposed to be +3610, so when the binary sum checked out to be -2810, we knew that something had to be wrong. Although this is a valid way of detecting overflow, it is not very efficient. After all, the whole idea of complementation is to be able to reliably add binary numbers together and not have to double-check the result by adding the same numbers together in decimal form! This is especially true for the purpose of building electronic circuits to add binary quantities together: the circuit has to be able to check itself for overflow without the supervision of a human being who already knows what the correct answer is.

What we need is a simple error-detection method that doesn't require any additional arithmetic. Perhaps the most elegant solution is to check for the sign of the sum and compare it against the signs of the numbers added. Obviously, two positive numbers added together should give a positive result, and two negative numbers added together should give a negative result. Notice that whenever we had a condition of overflow in the example problems, the sign of the sum was always opposite of the two added numbers: +1710 plus +1910 giving -2810, or -1710 plus -1910 giving +2810. By checking the signs alone we are able to tell that something is wrong.

But what about cases where a positive number is added to a negative number? What sign should the sum be in order to be correct? Or, more precisely, what sign of sum would necessarily indicate an overflow error? The answer to this is equally elegant: there will never be an overflow error when two numbers of opposite signs are added together! The reason for this is apparent when the nature of overflow is considered. Overflow occurs when the magnitude of a number exceeds the range allowed by the size of the bit field. The sum of two identically-signed numbers may very well exceed the range of the bit field of those two numbers, and so in this case overflow is a possibility. However, if a positive number is added to a negative number, the sum will always be closer to zero than either of the two added numbers: its magnitude must be less than the magnitude of either original number, and so overflow is impossible.

No comments:

Post a Comment