
IEEE Floating Point Representation

 

Concept of IEEE Floating-Point Representation

A floating-point number (say, the binary number 11.01) can be represented in several ways, e.g. 0.1101 × 2^2, 1.101 × 2^1 or 110.1 × 2^-1.

Thus, one specific representation has to be agreed upon, so that bit patterns are not wasted on several encodings of the same value. The method used to bring any number into this standard form is called normalization. Generally, two normalization standards are used:

 

1. Fractional form/Explicit Normalization: (-1)^S × 0.M × 2^E

2. Implicit Normalization: (-1)^S × 1.M × 2^E

 

Here, S represents the sign of the number, where S = 0 and S = 1 correspond to positive and negative numbers respectively. M stands for Mantissa and E stands for Exponent.
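The difference between the two normalization forms can be illustrated with a minimal sketch. The function name `decode` and its parameters are hypothetical, and E is taken here as the actual (unbiased) exponent. The mantissa bits are read as a binary fraction, and the only difference between the two forms is whether a leading 1 is assumed in front of the binary point.

```python
def decode(sign: int, mantissa_bits: str, exponent: int, implicit: bool) -> float:
    """Value of a binary floating-point number from its S, M and E fields."""
    # Read the mantissa bits as a binary fraction: b1/2 + b2/4 + b3/8 + ...
    fraction = sum(int(b) / 2 ** (i + 1) for i, b in enumerate(mantissa_bits))
    # Implicit normalization assumes 1.M, explicit normalization uses 0.M.
    significand = 1 + fraction if implicit else fraction
    return (-1) ** sign * significand * 2.0 ** exponent

# Binary 11.01 equals decimal 3.25 in both forms.
explicit_value = decode(0, "1101", 2, implicit=False)  # 0.1101 x 2^2
implicit_value = decode(0, "101", 1, implicit=True)    # 1.101 x 2^1
print(explicit_value, implicit_value)
```

Note that the implicit form needs one mantissa bit fewer for the same value, because the leading 1 is not stored.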

 

Need for biasing the exponent:

In floating-point representation, the exponent is biased in the engineering sense of the word: the exponent bits store the actual exponent shifted by a fixed offset. Biasing is done because exponents have to be signed values in order to represent both tiny and huge numbers, but 2’s complement, the usual representation for signed values, would make comparison harder. To solve this problem, the exponent is biased before being stored (a fixed number is added to the original exponent), which puts it within an unsigned range suitable for comparison.
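The comparison problem can be seen in a small sketch. It assumes an 8-bit exponent field with a bias of 127 (the IEEE single-precision values); the helper names are hypothetical. In 2's complement a negative exponent gets a larger unsigned bit pattern than a positive one, while the biased patterns keep the same order as the actual exponents.

```python
def twos_complement(value: int, bits: int) -> int:
    """Bit pattern of a signed value in two's complement, read as unsigned."""
    return value & ((1 << bits) - 1)

def biased(value: int, bias: int) -> int:
    """Stored (biased) exponent: the actual exponent plus the bias."""
    return value + bias

BITS, BIAS = 8, 127  # assumed: IEEE single-precision exponent width and bias

small, large = -3, 5  # actual exponents, small < large

# Two's complement: the negative exponent gets the larger unsigned pattern.
tc_small, tc_large = twos_complement(small, BITS), twos_complement(large, BITS)
# Biased: unsigned order matches the order of the actual exponents.
b_small, b_large = biased(small, BIAS), biased(large, BIAS)

print(f"two's complement: {tc_small:08b} vs {tc_large:08b}")
print(f"biased:           {b_small:08b} vs {b_large:08b}")
# Subtracting the bias recovers the actual exponents.
print(b_small - BIAS, b_large - BIAS)
```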

 

The fields are arranged so that the sign bit is in the most significant position, the biased exponent is in the middle, and the mantissa (also known as the significand) is in the least significant bits. With this arrangement the resulting values are ordered properly whether the bit pattern is interpreted as a floating-point value or as an integer value. This also allows high-speed comparison of floating-point numbers using fixed-point hardware. When the number is interpreted, the bias is subtracted to retrieve the actual exponent.
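The ordering property can be checked with a short sketch that reinterprets IEEE 754 single-precision encodings as unsigned integers using Python's standard `struct` module. The helper name `float_bits` is hypothetical. The check is restricted to non-negative values: because the sign is stored in sign-magnitude form, negative numbers compare in reverse order when their patterns are read as unsigned integers.

```python
import struct

def float_bits(x: float) -> int:
    """Reinterpret the single-precision encoding of x as an unsigned 32-bit integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

# Non-negative values in ascending order.
values = [0.0, 1.5e-10, 0.75, 1.0, 3.25, 1.0e10]
patterns = [float_bits(v) for v in values]

# The integer patterns are in ascending order as well.
is_ordered = patterns == sorted(patterns)

for v, p in zip(values, patterns):
    print(f"{v:>10g}  0x{p:08X}")
print("integer order matches floating-point order:", is_ordered)
```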

 

If K bits are used to store the exponent, then the value of the bias to be added is:
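Two bias conventions are in common use, and the following sketch computes both for comparison. IEEE 754 uses 2^(K-1) - 1, which gives 127 for the 8-bit single-precision exponent and 1023 for the 11-bit double-precision exponent. An excess-2^(K-1) scheme, such as excess-64 with a 7-bit exponent, uses 2^(K-1). The function names are hypothetical.

```python
def ieee_bias(k: int) -> int:
    """Bias used by IEEE 754 for a K-bit exponent field: 2^(K-1) - 1."""
    return 2 ** (k - 1) - 1

def excess_bias(k: int) -> int:
    """Bias of an excess-2^(K-1) scheme for a K-bit exponent field."""
    return 2 ** (k - 1)

print(ieee_bias(8))    # single precision
print(ieee_bias(11))   # double precision
print(excess_bias(7))  # excess-64 with a 7-bit exponent
```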

 

 

 

Note:

Among the two possible normalization techniques, IEEE uses implicit normalization.

If nothing is mentioned, then explicit normalization will be used.

 

Example 1:

Consider a 16-bit register of the following form, used to store a floating-point number. The mantissa is a normalized sign-magnitude fraction. The exponent is expressed in excess-64 format. The base of the system is 2.

 

 

Solution:

1 bit is always allocated as sign bit.

 

Solution 1:

8 bits are allocated for the fractional mantissa.
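The register layout of this example can be sketched as follows. The field widths come from the example itself (1 sign bit, 8 mantissa bits, and therefore 16 - 1 - 8 = 7 exponent bits in excess-64). The field order (sign, then exponent, then mantissa), the use of the explicit form 0.M, and the names `pack` and `unpack` are assumptions made for illustration.

```python
BIAS = 64      # excess-64, as stated in the example
EXP_BITS = 7   # 16 bits - 1 sign bit - 8 mantissa bits
MAN_BITS = 8

def pack(sign: int, exponent: int, mantissa: int) -> int:
    """Assemble the 16-bit word from the sign, actual exponent and mantissa field."""
    stored = exponent + BIAS  # bias the exponent before storing it
    if not 0 <= stored < (1 << EXP_BITS):
        raise ValueError("exponent out of range")
    if not 0 <= mantissa < (1 << MAN_BITS):
        raise ValueError("mantissa out of range")
    return (sign << 15) | (stored << MAN_BITS) | mantissa

def unpack(word: int) -> float:
    """Recover the value, reading the mantissa as the fraction 0.M."""
    sign = (word >> 15) & 1
    stored = (word >> MAN_BITS) & ((1 << EXP_BITS) - 1)
    mantissa = word & ((1 << MAN_BITS) - 1)
    fraction = mantissa / (1 << MAN_BITS)
    # Subtract the bias to retrieve the actual exponent.
    return (-1) ** sign * fraction * 2.0 ** (stored - BIAS)

word = pack(0, 2, 0b11010000)  # +0.1101 x 2^2
print(f"{word:016b}", unpack(word))
```

Here the stored exponent is 2 + 64 = 66 (binary 1000010), and the decoded value is 0.8125 × 4 = 3.25.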

 

 

Solution 2:

(Here, explicit normalization is to be used).

 

Solution 3:

If K bits (here, K = 7) are used to represent exponent, 

 

As the mantissa is of 8 bits

Solution 4: