User's Guide
The PowerPC floating-point hardware performs calculations in either IEEE
single-precision (equivalent to REAL(4) in Fortran programs) or IEEE
double-precision (equivalent to REAL(8) in Fortran
programs).
Keep the following considerations in mind:
- Double precision provides greater range (approximately 10**(-308) to
10**308) and precision (about 15 decimal digits) than single precision
(approximate range 10**(-38) to 10**38, with about 7 decimal digits of
precision).
- Computations that mix single and double operands are performed in double
precision, which requires conversion of the single-precision operands to
double precision. These conversions do not affect performance.
- Double-precision values that are converted to single precision (such as
when you specify the SNGL intrinsic or when a double-precision
computation result is stored into a single-precision variable) require
rounding operations. A rounding operation produces the correctly rounded
single-precision value for the IEEE rounding mode in effect. Because of
rounding error, the value may be less precise than the original
double-precision value. Conversions from double-precision
values to single-precision values may reduce the performance of your
code.
- Programs that manipulate large amounts of floating-point data may run
faster if they use REAL(4) rather than REAL(8)
variables. (You need to ensure that REAL(4) variables provide
you with acceptable range and precision.) The programs may run faster
because the smaller data size reduces memory traffic, which can be a
performance bottleneck for some applications.
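The trade-offs in the list above can be checked with a short program. The following is a minimal sketch, assuming that default REAL and DOUBLE PRECISION map to IEEE single and double precision (REAL(4) and REAL(8) in XL Fortran). The names sp, dp, d, s, and err are illustrative only.

```fortran
program precision_demo
  implicit none
  ! Assumption: default REAL and DOUBLE PRECISION are IEEE single and double,
  ! which correspond to REAL(4) and REAL(8) in XL Fortran.
  integer, parameter :: sp = kind(1.0)
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: d, err
  real(sp) :: s

  ! Inquiry intrinsics: guaranteed decimal digits and decimal exponent range.
  print *, 'single: precision =', precision(1.0_sp), ' range =', range(1.0_sp)
  print *, 'double: precision =', precision(1.0_dp), ' range =', range(1.0_dp)

  ! 0.1 is not exactly representable in binary, so narrowing must round.
  d = 0.1_dp
  s = sngl(d)
  err = abs(real(s, dp) - d)
  print *, 'error introduced by SNGL:', err
end program precision_demo
```

The PRECISION intrinsic reports the number of decimal digits that are guaranteed to survive, so it can print a value one lower than the "about 7" and "about 15" figures quoted above.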
The floating-point hardware also provides a special set of double-precision
operations that multiply two numbers and add a third number to the
product. These combined multiply-add (MAF) operations are
performed at the same speed as a multiply or an add operation alone.
The MAF operations are an extension to the IEEE
standard because they perform the multiply and add with one rounding
error rather than two. As a result, the MAF operations are faster and more
accurate than the equivalent separate operations.
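The effect of one rounding instead of two can be seen with operands whose exact product falls halfway between two double-precision values. The following is a sketch only. It does not invoke the hardware MAF instruction directly; it emulates the single rounding by forming the product in a wider kind, and it assumes that selected_real_kind(30) returns a valid kind (REAL(16) in XL Fortran) and that round-to-nearest is in effect. The VOLATILE variable p is an illustrative device to keep the compiler from combining the separate multiply and add.

```fortran
program maf_rounding_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! Assumption: a real kind with at least 30 decimal digits exists
  ! (REAL(16) in XL Fortran); it is used only to hold the exact product.
  integer, parameter :: qp = selected_real_kind(30)
  real(dp) :: a, b, c, two_step, one_rounding
  real(dp), volatile :: p   ! forces the product to be rounded and stored

  a = 1.0_dp + scale(1.0_dp, -27)   ! 1 + 2**(-27)
  b = 1.0_dp - scale(1.0_dp, -27)   ! 1 - 2**(-27)
  c = -1.0_dp

  ! Separate multiply and add: the exact product 1 - 2**(-54) is rounded
  ! to 1.0 before the add, so the small term is lost.
  p = a * b
  two_step = p + c

  ! One rounding, emulated: exact product and sum in the wider kind,
  ! then a single conversion back to double precision.
  one_rounding = real(real(a, qp) * real(b, qp) + real(c, qp), dp)

  print *, 'two roundings:', two_step
  print *, 'one rounding :', one_rounding
end program maf_rounding_demo
```

Under these assumptions the two-step result is zero, while the single-rounding result keeps the term -2**(-54).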
XL Fortran extended precision does not use the format suggested by the IEEE
standard, which recommends extended formats with more bits in both the exponent
(for greater range) and the fraction (for greater precision).
XL Fortran extended precision, equivalent to REAL(16) in Fortran
programs, is implemented in software. Extended precision provides the
same range as double precision (about 10**(-308) to 10**308), but more
precision (a variable amount, about 31 decimal digits or more). The
software support is restricted to round-to-nearest mode. Programs that
use extended precision must ensure that this rounding mode is in effect when
extended-precision calculations are performed. See Selecting the Rounding
Mode for the different ways you can control the rounding mode.
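One way to guard an extended-precision calculation is to save the current rounding mode, switch to round-to-nearest, and restore the saved mode afterward. The following sketch assumes a compiler that provides the Fortran 2003 IEEE_ARITHMETIC intrinsic module, which is not necessarily one of the mechanisms described in Selecting the Rounding Mode. The names qp, saved_mode, active_mode, and third are illustrative.

```fortran
program extended_rounding_guard
  use, intrinsic :: ieee_arithmetic
  implicit none
  ! Assumption: selected_real_kind(30) selects the extended-precision kind
  ! (REAL(16) in XL Fortran).
  integer, parameter :: qp = selected_real_kind(30)
  type(ieee_round_type) :: saved_mode, active_mode
  real(qp) :: third

  call ieee_get_rounding_mode(saved_mode)     ! remember the caller's mode
  call ieee_set_rounding_mode(ieee_nearest)   ! needed for extended precision
  call ieee_get_rounding_mode(active_mode)

  third = 1.0_qp / 3.0_qp                     ! extended-precision work goes here

  call ieee_set_rounding_mode(saved_mode)     ! restore the caller's mode

  print *, 'round-to-nearest was in effect:', active_mode == ieee_nearest
  print *, third
end program extended_rounding_guard
```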
Programs that specify extended-precision values as hexadecimal, octal,
binary, or Hollerith constants must follow these conventions:
- Extended-precision numbers are composed of two double-precision numbers
with different magnitudes that do not overlap. That is, the binary
exponents differ by at least the number of fraction bits in a
REAL(8). The high-order double-precision value (the one that
comes first in storage) must have the larger magnitude. The value of
the extended-precision number is the sum of the two double-precision
values.
- For a value of NaN or infinity, you must encode that value in the
high-order double-precision value. The low-order value is not
significant.
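The conventions for finite values can be expressed as a small check on a pair of double-precision numbers. The following is a sketch of one reading of the rules above, not the compiler's own validation. The function valid_pair is a hypothetical helper, and it takes "the number of fraction bits in a REAL(8)" to be the value returned by the DIGITS intrinsic (53 for IEEE double precision).

```fortran
program extended_pair_check
  implicit none
  integer, parameter :: dp = kind(1.0d0)

  print *, valid_pair(1.0_dp, scale(1.0_dp, -60))   ! ordered, no overlap: true
  print *, valid_pair(1.0_dp, scale(1.0_dp, -10))   ! fraction bits overlap: false
  print *, valid_pair(scale(1.0_dp, -60), 1.0_dp)   ! larger value not first: false

contains

  ! Hypothetical helper: tests the stated conventions for finite values only.
  logical function valid_pair(hi, lo)
    real(dp), intent(in) :: hi, lo
    if (lo == 0.0_dp) then
      valid_pair = .true.     ! a zero low-order part cannot overlap
    else
      ! The high-order value must be larger in magnitude, and the binary
      ! exponents must differ by at least the number of fraction bits.
      valid_pair = abs(hi) > abs(lo) .and. &
                   exponent(hi) - exponent(lo) >= digits(hi)
    end if
  end function valid_pair
end program extended_pair_check
```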
Because an XL Fortran extended-precision value can be the sum of two values
with greatly different exponents, leaving a number of assumed zeros in the
fraction, the format actually has a variable precision, with a minimum of
about 31 decimal digits. You get more precision in cases where the
exponents of the two double values differ by more than the number
of fraction bits in a double-precision value. This encoding allows an
efficient implementation intended for applications requiring more precision
but no more range than double precision.
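The variable precision can be estimated by counting the bits spanned from the leading bit of the high-order value to the last fraction bit of the low-order value. The following sketch uses a hypothetical helper, decimal_digits, and assumes IEEE double precision for both parts. It is an illustration of the arithmetic, not a description of how XL Fortran computes precision.

```fortran
program extended_effective_precision
  implicit none
  integer, parameter :: dp = kind(1.0d0)

  ! Tightest packing: the low-order part starts right below the high-order part.
  print *, decimal_digits(1.0_dp, scale(1.0_dp, -53))
  ! Wider gap: assumed zeros between the two parts add precision.
  print *, decimal_digits(1.0_dp, scale(1.0_dp, -100))

contains

  ! Hypothetical helper: bits spanned from the leading bit of hi to the
  ! last fraction bit of lo, converted to decimal digits.
  real(dp) function decimal_digits(hi, lo)
    real(dp), intent(in) :: hi, lo
    integer :: span_bits
    span_bits = exponent(hi) - exponent(lo) + digits(lo)
    decimal_digits = span_bits * log10(2.0_dp)
  end function decimal_digits
end program extended_effective_precision
```

With the tightest packing the span is 106 bits, which corresponds to the minimum of about 31 decimal digits; any larger gap between the exponents increases the figure.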
Notes:
- When you read the discussions of rounding errors caused by compile-time
folding of expressions, keep in mind that this folding produces different
results for extended-precision values more often than for other precisions.
- Special numbers, such as NaN and infinity, are not fully supported for
extended-precision values. Arithmetic operations do not necessarily
propagate these numbers in extended precision.
- XL Fortran does not always detect floating-point exception conditions (see
Detecting and Trapping Floating-Point Exceptions) for extended-precision
values. If you turn on floating-point exception trapping in programs that
use extended precision, XL Fortran may also generate signals in cases where
an exception condition does not really occur.
© Copyright IBM Corporation 1990, 1998.