IEEE 754 : 2008
FLOATING-POINT ARITHMETIC
Institute of Electrical & Electronics Engineers
1 Overview
1.1 Scope
1.2 Purpose
1.3 Inclusions
1.4 Exclusions
1.5 Programming environment considerations
1.6 Word usage
2 Definitions, abbreviations, and acronyms
2.1 Definitions
2.2 Abbreviations and acronyms
3 Floating-point formats
3.1 Overview
3.2 Specification levels
3.3 Sets of floating-point data
3.4 Binary interchange format encodings
3.5 Decimal interchange format encodings
3.6 Interchange format parameters
3.7 Extended and extendable precisions
4 Attributes and rounding
4.1 Attribute specification
4.2 Dynamic modes for attributes
4.3 Rounding-direction attributes
5 Operations
5.1 Overview
5.2 Decimal exponent calculation
5.3 Homogeneous general-computational operations
5.4 formatOf general-computational operations
5.5 Quiet-computational operations
5.6 Signaling-computational operations
5.7 Non-computational operations
5.8 Details of conversions from floating-point to integer formats
5.9 Details of operations to round a floating-point datum to integral value
5.10 Details of totalOrder predicate
5.11 Details of comparison predicates
5.12 Details of conversion between floating-point data and external character sequences
6 Infinity, NaNs, and sign bit
6.1 Infinity arithmetic
6.2 Operations with NaNs
6.3 The sign bit
7 Default exception handling
7.1 Overview: exceptions and flags
7.2 Invalid operation
7.3 Division by zero
7.4 Overflow
7.5 Underflow
7.6 Inexact
8 Alternate exception handling attributes
8.1 Overview
8.2 Resuming alternate exception handling attributes
8.3 Immediate and delayed alternate exception handling attributes
9 Recommended operations
9.1 Conforming language- and implementation-defined functions
9.2 Recommended correctly rounded functions
9.3 Operations on dynamic modes for attributes
9.4 Reduction operations
10 Expression evaluation
10.1 Expression evaluation rules
10.2 Assignments, parameters, and function values
10.3 preferredWidth attributes for expression evaluation
10.4 Literal meaning and value-changing optimizations
11 Reproducible floating-point results
Annex A (informative) Bibliography
Annex B (informative) Program debugging support
Index of operations
Describes formats and methods for floating-point arithmetic in computer systems, covering standard and extended functions with single, double, extended, and extendable precision, and recommends formats for data interchange. Defines exception conditions and specifies standard handling of these conditions.
Document Type | Standard
Status | Current
Publisher | Institute of Electrical & Electronics Engineers