
bfloat16 floating-point format



The bfloat16 floating-point format is a computer number format that occupies 16 bits in computer memory; it represents a wide dynamic range of numeric values by using a floating radix point. The format is a truncated (16-bit) version of the 32-bit IEEE 754 single-precision floating-point format (binary32), intended to accelerate machine learning. It retains binary32's 8 exponent bits but provides only a 7-bit significand rather than binary32's 23-bit significand.
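As a concrete, hedged illustration of that precision difference (the round-trip helper below is hypothetical and not part of any library; it truncates through a binary32 encoding using Python's standard struct module): a value whose significand needs more than 7 bits loses its low-order bits, while the exponent, and hence the magnitude range, is unaffected.

    import struct

    def round_trip_bfloat16(x: float) -> float:
        # Illustrative helper: encode x as binary32, keep only the top 16 bits
        # (sign, 8 exponent bits, 7 fraction bits), then widen back to binary32.
        bits32 = struct.unpack('<I', struct.pack('<f', x))[0]
        return struct.unpack('<f', struct.pack('<I', (bits32 >> 16) << 16))[0]

    print(round_trip_bfloat16(256.0))   # 256.0  -- fits in 7 fraction bits
    print(round_trip_bfloat16(257.0))   # 256.0  -- 257 needs 8 fraction bits, so it is truncated
    print(round_trip_bfloat16(3.0e38))  # ~2.99e38 -- still finite, because the exponent bits are kept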

The bfloat16 format is used in upcoming Intel AI processors such as the Nervana NNP-L1000, Xeon processors, and Intel FPGAs,[1][2][3] in Google Cloud TPUs,[4][5][6] and in TensorFlow.[6][7]

bfloat16 floating-point format

bfloat16 has the following format:

- Sign bit: 1 bit
- Exponent width: 8 bits
- Significand precision: 7 bits explicitly stored (8 bits including the implicit leading bit)

The bfloat16 format, being a truncated IEEE 754 single-precision 32-bit float, allows fast conversion to and from an IEEE 754 single-precision 32-bit float: conversion preserves the exponent bits and shortens the significand. Preserving the exponent bits maintains the 32-bit float's range of approximately 1e-38 to approximately 3e38.[8]
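As an illustration of this conversion (not from the article; the helper names are hypothetical and only Python's standard struct module is assumed), the sketch below truncates a binary32 bit pattern to its upper 16 bits to obtain bfloat16, and zero-fills the discarded fraction bits to go back. Truncation is used purely for simplicity; hardware may round instead when narrowing.

    import struct

    def float32_to_bfloat16_bits(x: float) -> int:
        # Reinterpret the binary32 encoding as an integer and keep only the upper
        # 16 bits: the sign, the 8 exponent bits, and the top 7 fraction bits.
        bits32 = struct.unpack('<I', struct.pack('<f', x))[0]
        return bits32 >> 16

    def bfloat16_bits_to_float32(bits16: int) -> float:
        # Widening back to binary32 is exact: zero-fill the 16 low-order bits.
        return struct.unpack('<f', struct.pack('<I', bits16 << 16))[0]

    print(hex(float32_to_bfloat16_bits(1.0)))  # 0x3f80
    print(bfloat16_bits_to_float32(0x3f80))    # 1.0
    print(bfloat16_bits_to_float32(0xc000))    # -2.0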

The bits are laid out as follows, from most significant to least significant: 1 sign bit, 8 exponent bits, and 7 fraction bits.

Structure of the bfloat16 floating-point format
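A minimal decoding sketch, assuming the layout just described (sign in bit 15, exponent in bits 14-7, fraction in bits 6-0) and covering only normal numbers; the function name is illustrative and not part of any standard API.

    def decode_bfloat16(bits16: int) -> float:
        # Split the 16-bit pattern into its three fields.
        sign = (bits16 >> 15) & 0x1
        exponent = (bits16 >> 7) & 0xFF
        fraction = bits16 & 0x7F
        # Normal numbers: implicit leading 1 and an exponent bias of 127,
        # the same bias as binary32 because the exponent field is identical.
        return (-1) ** sign * 2.0 ** (exponent - 127) * (1 + fraction / 128)

    print(decode_bfloat16(0x3F80))  # 1.0
    print(decode_bfloat16(0xC000))  # -2.0
    print(decode_bfloat16(0x4049))  # 3.140625 (the bfloat16 value closest to pi)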

Contrast this with an IEEE 754 single-precision 32-bit float, which uses 1 sign bit, 8 exponent bits, and 23 fraction bits:
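For a concrete contrast (an illustrative sketch using Python's standard struct and math modules; the variable names are not from the article), the example below encodes pi in both formats: the sign and exponent fields are identical, but binary32 keeps 23 fraction bits where bfloat16 keeps only 7.

    import math
    import struct

    # pi encoded as binary32: 1 sign bit, 8 exponent bits, 23 fraction bits.
    pi32_bits = struct.unpack('<I', struct.pack('<f', math.pi))[0]
    pi32 = struct.unpack('<f', struct.pack('<I', pi32_bits))[0]

    # pi encoded as bfloat16: the same sign and exponent bits,
    # but only the top 7 of the 23 fraction bits survive truncation.
    pi_bf16_bits = pi32_bits >> 16
    pi_bf16 = struct.unpack('<f', struct.pack('<I', pi_bf16_bits << 16))[0]

    print(f"binary32: 0x{pi32_bits:08x} -> {pi32}")        # 0x40490fdb -> 3.1415927410125732
    print(f"bfloat16: 0x{pi_bf16_bits:04x} -> {pi_bf16}")   # 0x4049 -> 3.140625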

Zero

Denormalized numbers

Representation of non-numbers

Positive and negative infinity

NaN

Range and precision

Rounding modes

Examples

See also

References

  1. ^ Khari Johnson (2018-05-23). "Intel unveils Nervana Neural Net L-1000 for accelerated AI training". VentureBeat. Retrieved 2018-05-23. ...Intel will be extending bfloat16 support across our AI product lines, including Intel Xeon processors and Intel FPGAs.
  2. ^ Michael Feldman (2018-05-23). "Intel Lays Out New Roadmap for AI Portfolio". TOP500 Supercomputer Sites. Retrieved 2018-05-23. Intel plans to support this format across all their AI products, including the Xeon and FPGA lines
  3. ^ Lucian Armasu (2018-05-23). "Intel To Launch Spring Crest, Its First Neural Network Processor, In 2019". Tom's Hardware. Retrieved 2018-05-23. Intel said that the NNP-L1000 would also support bfloat16, a numerical format that's being adopted by all the ML industry players for neural networks. The company will also support bfloat16 in its FPGAs, Xeons, and other ML products. The Nervana NNP-L1000 is scheduled for release in 2019.
  4. ^ "Available TensorFlow Ops | Cloud TPU | Google Cloud". Google Cloud. Retrieved 2018-05-23. This page lists the TensorFlow Python APIs and graph operators available on Cloud TPU.
  5. ^ Elmar Haußmann (2018-04-26). "Comparing Google's TPUv2 against Nvidia's V100 on ResNet-50". RiseML Blog. Retrieved 2018-05-23. For the Cloud TPU, Google recommended we use the bfloat16 implementation from the official TPU repository with TensorFlow 1.7.0. Both the TPU and GPU implementations make use of mixed-precision computation on the respective architecture and store most tensors with half-precision.
  6. ^ a b TensorFlow Authors (2018-02-28). "ResNet-50 using BFloat16 on TPU". Google. Retrieved 2018-05-23.
  7. ^ Joshua V. Dillon, Ian Langmore, Dustin Tran, Eugene Brevdo, Srinivas Vasudevan, Dave Moore, Brian Patton, Alex Alemi, Matt Hoffman, Rif A. Saurous (2017-11-28). TensorFlow Distributions (Report). arXiv:1711.10604. Bibcode:2017arXiv171110604D. Retrieved 2018-05-23. All operations in TensorFlow Distributions are numerically stable across half, single, and double floating-point precisions (as TensorFlow dtypes: tf.bfloat16 (truncated floating point), tf.float16, tf.float32, tf.float64). Class constructors have a validate_args flag for numerical asserts.
  8. ^ "Livestream Day 1: Stage 8 (Google I/O '18) - YouTube". Google. 2018-05-08. Retrieved 2018-05-23. In many models this is a drop-in replacement for float-32