Examples on IEEE Floating-Point Numbers

Before continuing with these examples, please read the previous chapter: Concept of IEEE Floating-Point Representation.

Example 1:

Consider a 16-bit floating-point number where the mantissa is a sign-magnitude fraction and the exponent is in biased form. 9 bits are allocated for the mantissa.

Answer:

Solution (a):

Of the 16 bits, 9 go to the mantissa and 1 to the sign, so the exponent gets 16 - (9 + 1) = 16 - 10 = 6 bits.

Solution (b):

Range of positive mantissa:

The mantissa is a 9-bit fraction, so the maximum value of the mantissa = $0.111111111_2 = 1 - 2^{-9}$

and the minimum value of mantissa =

Number representation in the format specified:

In floating-point representation, not every number can be represented exactly, since the representable numbers are distributed non-uniformly and non-continuously. So there is a chance of error whenever a number is approximated to the available precision.

First Maximum +ve number:

Second maximum +ve number:

Difference between the first maximum +ve number and second maximum +ve number

NOTE:

Since the positive and negative number representations are symmetric, the difference between the first maximum -ve number and the second maximum -ve number is also $2^{22}$. So it is sufficient to analyze the number representation for +ve numbers only.
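The $2^{22}$ gap can be checked with a short Python sketch. It takes the 9-bit mantissa and 6-bit exponent from Solution (a) and the bias of 32 quoted in Solution (c). The value rule used here, a pure fraction 0.M multiplied by $2^{E - 32}$, is an assumption made for illustration, and the helper name `value` is hypothetical.

```python
from fractions import Fraction

MANTISSA_BITS = 9   # from Solution (a): 16 - (9 + 1) = 6 exponent bits
EXPONENT_BITS = 6
BIAS = 32           # value quoted in Solution (c)


def value(mantissa_field: int, exponent_field: int) -> Fraction:
    """Assumed value rule: the fraction 0.M times 2**(E - BIAS)."""
    fraction = Fraction(mantissa_field, 2 ** MANTISSA_BITS)
    return fraction * Fraction(2) ** (exponent_field - BIAS)


max_exponent = 2 ** EXPONENT_BITS - 1   # all exponent bits set: 63
max_mantissa = 2 ** MANTISSA_BITS - 1   # all mantissa bits set: 511

first_max = value(max_mantissa, max_exponent)        # (1 - 2**-9) * 2**31
second_max = value(max_mantissa - 1, max_exponent)   # (1 - 2**-8) * 2**31
gap_at_top = first_max - second_max

print(gap_at_top == 2 ** 22)  # True
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the comparison is not itself affected by floating-point rounding.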

First minimum +ve number:

Remember,

as a generic representation of value expression is

Second minimum +ve number:

Difference between the first minimum and second minimum is

Conclusion:

(i) The difference between the first and second maximum in this format is $2^{22}$

(ii) The difference between first and second

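(i) The difference between the first and second maximum is far greater than the difference between the first and second minimum.

→ The gap between adjacent representable numbers is largest between the first and second maximum,

→ so the possible error is largest between the first and second maximum.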

(ii) The maximum error occurs when a number lies exactly halfway between the first and second maximum, because the deviation from the nearest representable number on either side is then equal and as large as it can be.

(iii) Minimum error is possible between 1st min and 2nd min.

(iv) The number system is clustered towards 0, i.e. the number representation is dense near 0 and sparse away from 0.
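The observations above can be illustrated with a minimal Python sketch. It assumes the same hypothetical value rule, 0.M × $2^{E - 32}$ with a 9-bit mantissa, and also assumes that unnormalized mantissas are allowed and that exponent field 0 is an ordinary exponent. Under those assumptions, the spacing between adjacent numbers at exponent field E is $2^{-9} \times 2^{E - 32}$, and the worst-case rounding error is half of that spacing.

```python
from fractions import Fraction

MANTISSA_BITS = 9   # mantissa width used in Example 1
BIAS = 32           # bias quoted in Solution (c)


def gap(exponent_field: int) -> Fraction:
    """Spacing between adjacent mantissa codes at one exponent field (assumed rule)."""
    one_step = Fraction(1, 2 ** MANTISSA_BITS)   # mantissa changes by 2**-9
    return one_step * Fraction(2) ** (exponent_field - BIAS)


# Smallest, middle and largest exponent fields of a 6-bit exponent.
for e in (0, 32, 63):
    spacing = gap(e)
    worst_error = spacing / 2   # a value exactly halfway between two neighbours
    print(f"E={e:2d}  gap={float(spacing):.3e}  max error={float(worst_error):.3e}")
```

The spacing grows by a factor of two with each exponent step, which is why the representable numbers crowd together near 0 and thin out towards the largest magnitudes.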

Solution (c):

Bias = 32

So, Error = 0.125

Example 2:

Consider an IBM system that allocates 32 bits for a floating-point number. The base of the system is 16. The mantissa is in normalized sign-magnitude form. The exponent is in excess-128 format. Express the largest representable value as powers of a number system with base 11.

Solution:

Excess -128 format is required, which means