Conversion
BFloat16
Conversion to and from Microfloat uses BFloat16 as an intermediate type. BFloat16 has 1 sign bit, 8 exponent bits, and 7 significand (mantissa) bits, and can therefore represent all Microfloat types.
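To make the 1/8/7 layout concrete, here is a minimal sketch that splits a value into its BFloat16 fields. It is written in Python purely for illustration and does not use or mirror the Microfloats API. The helper name `bfloat16_fields` is hypothetical. The sketch relies on the fact that a BFloat16 is the upper 16 bits of an IEEE binary32 value, and it assumes the input is already exactly representable in BFloat16, so plain truncation loses nothing.

```python
import struct

def bfloat16_fields(x: float) -> tuple[int, int, int]:
    """Split x into (sign, exponent, significand) BFloat16 fields.

    Hypothetical helper: assumes x is exactly representable in BFloat16,
    so dropping the low 16 bits of its float32 encoding is lossless.
    """
    bits32 = struct.unpack(">I", struct.pack(">f", x))[0]
    bits16 = bits32 >> 16             # BFloat16 = upper half of float32
    sign = bits16 >> 15               # 1 bit
    exponent = (bits16 >> 7) & 0xFF   # 8 bits, biased by 127
    significand = bits16 & 0x7F       # 7 explicit fraction bits
    return sign, exponent, significand

print(bfloat16_fields(1.5))   # (0, 127, 64)
print(bfloat16_fields(-2.0))  # (1, 128, 0)
```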
Rounding
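Converting from a wider type rounds to the nearest representable value. When the source value lies exactly halfway between two representable values, the tie is broken toward the even one, i.e. the value whose bit representation ends in 0.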
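As an illustration of this rounding mode, the sketch below rounds a float32 to BFloat16 using round-to-nearest, ties-to-even. It is a Python sketch of the general technique, not the package's implementation. The function names are hypothetical, and float32 to BFloat16 is used only because its bit layout is stated above. Inputs are assumed to fit in float32.

```python
import struct

def float32_bits(x: float) -> int:
    """Return the IEEE binary32 encoding of x as an unsigned integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def round_to_bfloat16_bits(x: float) -> int:
    """Round x (as float32) to BFloat16 bits: nearest, ties to even."""
    bits = float32_bits(x)
    is_nan = (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF) != 0
    if is_nan:
        # Keep NaN a NaN: force a fraction bit so truncation cannot yield Inf.
        return (bits >> 16) | 0x0040
    lsb = (bits >> 16) & 1    # last bit that survives the truncation
    bits += 0x7FFF + lsb      # add just under half an ulp, plus one on odd
    return (bits >> 16) & 0xFFFF

# 1 + 2^-8 lies halfway between 0x3F80 and 0x3F81: the tie goes to 0x3F80.
print(hex(round_to_bfloat16_bits(1.00390625)))  # 0x3f80
# 1 + 2^-7 + 2^-8 lies halfway between 0x3F81 and 0x3F82: it goes to 0x3F82.
print(hex(round_to_bfloat16_bits(1.01171875)))  # 0x3f82
```

In both tie cases the result ends in a 0 bit, which is the "even" value described above.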
Overflow policies
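When converting from a wider type to a Microfloat, the source value may be NaN, ±Inf, or larger in magnitude than the destination's floatmax, and the destination type may not be able to represent Inf or NaN at all. The table below lists the result under the SAT and OVF policies for each kind of destination type: one that has both Inf and NaN, one that has only NaN, and one that has only finite values.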
| Source value | Has Inf+NaN (SAT) | Has Inf+NaN (OVF) | Has NaN (SAT) | Has NaN (OVF) | Finite (SAT) | Finite (OVF) |
|---|---|---|---|---|---|---|
| NaN | NaN | NaN | NaN | NaN | Error | Error |
| ±Inf | ±floatmax | ±Inf | ±floatmax | NaN | ±floatmax | Error |
| \|x\| > floatmax | ±floatmax | ±Inf | ±floatmax | NaN | ±floatmax | Error |
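The table can be read as a small decision procedure. The following Python sketch encodes it directly. It is an illustration only, not the Microfloats API: the function `convert_special`, its parameters, the exception types standing in for "Error", and the example `floatmax` of 448.0 are all assumptions. In-range values are returned unchanged, since ordinary rounding is not modelled here.

```python
import math

def convert_special(x: float, floatmax: float, has_inf: bool,
                    has_nan: bool, policy: str) -> float:
    """Apply the overflow table to x for a hypothetical destination type.

    policy is "SAT" or "OVF"; has_inf/has_nan describe the destination.
    """
    if policy not in ("SAT", "OVF"):
        raise ValueError(f"unknown policy: {policy}")
    if math.isnan(x):
        if has_nan:
            return math.nan
        # Finite-only destination: the table lists Error for both policies.
        raise ValueError("NaN is not representable in the destination")
    if abs(x) > floatmax:                     # also covers +Inf and -Inf
        if policy == "SAT":
            return math.copysign(floatmax, x)  # clamp, keeping the sign
        if has_inf:
            return math.copysign(math.inf, x)
        if has_nan:
            return math.nan
        raise OverflowError("value overflows a finite-only destination")
    return x  # in range: normal rounding would apply (not modelled)

print(convert_special(1e9, 448.0, True, True, "SAT"))        # 448.0
print(convert_special(-math.inf, 448.0, True, True, "OVF"))  # -inf
print(convert_special(1e9, 448.0, False, True, "OVF"))       # nan
```

Under SAT the result never leaves the finite range, while OVF falls back to the most "special" value the destination has: Inf if available, otherwise NaN, otherwise an error.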
Microfloats.OVF: Type
Microfloats.SAT: Type