Conversion
BFloat16
Conversion to and from Microfloat types uses BFloat16 as an intermediate type. BFloat16 has 1 sign bit, 8 exponent bits, and 7 significand (mantissa) bits, so it can represent the values of every Microfloat type.
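To make the 1/8/7 bit layout concrete, the sketch below splits a BFloat16 bit pattern into its three fields. `bfloat16_fields` is a hypothetical helper written for illustration, not part of Microfloats; it relies only on the fact that a BFloat16 pattern matches the upper 16 bits of a Float32.

```julia
# Hypothetical helper (not a Microfloats function): split a BFloat16 bit
# pattern into its sign, exponent, and significand fields.
function bfloat16_fields(bits::UInt16)
    sign_bit    = (bits >> 15) & 0x0001   # 1 bit
    exponent    = (bits >> 7)  & 0x00ff   # 8 bits, biased by 127
    significand = bits & 0x007f           # 7 explicit bits
    return (; sign_bit, exponent, significand)
end

# 1.5f0 is 0x3fc00000 as a Float32, so its BFloat16 pattern is 0x3fc0.
bits = UInt16(reinterpret(UInt32, 1.5f0) >> 16)
fields = bfloat16_fields(bits)
# fields.sign_bit == 0x0000, fields.exponent == 0x007f, fields.significand == 0x0040
```

The exponent field 0x7f (127) is the bias, i.e. an unbiased exponent of 0, and the significand 0x40 sets only the top fraction bit, which gives 1.5.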
Rounding
Converting from a larger type rounds to the nearest representable value. Ties are broken toward the even value, i.e. the one whose bit representation ends in 0.
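As an illustration of this rounding mode, the following sketch rounds a Float32 to BFloat16 precision with plain bit arithmetic. It is not taken from the Microfloats source: `round_to_bfloat16_bits` is a hypothetical name, and NaN inputs are deliberately not given special treatment.

```julia
# Hypothetical helper (not a Microfloats function): round a Float32 to a
# BFloat16 bit pattern using round-to-nearest, ties-to-even.
# NaN inputs are not handled specially in this sketch.
function round_to_bfloat16_bits(x::Float32)
    u = reinterpret(UInt32, x)
    lsb = (u >> 16) & 0x00000001   # lowest bit that survives the conversion
    # Add just under half a unit in the last kept place, plus 1 when the kept
    # bit is odd, so that exact ties carry only toward the even neighbor.
    u += 0x00007fff + lsb
    return UInt16(u >> 16)
end

round_to_bfloat16_bits(1.0f0)          # 0x3f80, exactly representable
round_to_bfloat16_bits(1.00390625f0)   # 0x3f80, tie between 0x3f80 and 0x3f81
round_to_bfloat16_bits(1.01171875f0)   # 0x3f82, tie between 0x3f81 and 0x3f82
```

Both tie cases land on a pattern whose last bit is 0, once by rounding down and once by rounding up.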
Overflow policies
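When converting from a wider type to a Microfloat, the overflow policy (SAT or OVF) determines how NaN, ±Inf, and values whose magnitude exceeds the destination's floatmax are mapped. The result also depends on whether the destination type has both Inf and NaN, only NaN, or only finite values.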
Destination value, by destination type and overflow policy:

| Source Value | Has Inf+NaN (SAT) | Has Inf+NaN (OVF) | Has NaN (SAT) | Has NaN (OVF) | Finite (SAT) | Finite (OVF) |
|---|---|---|---|---|---|---|
| NaN | NaN | NaN | NaN | NaN | Error | Error |
| ±Inf | ±floatmax | ±Inf | ±floatmax | NaN | ±floatmax | Error |
| >\|floatmax\| | ±floatmax | ±Inf | ±floatmax | NaN | ±floatmax | Error |
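The table can be read as a small decision procedure. The sketch below models it on plain Float64 values and is only an illustration: `overflow_result`, the `kind` and `policy` symbols, and the choice of `DomainError` for the "Error" cells are all assumptions, not the Microfloats API.

```julia
# Hypothetical model of the table above; none of these names come from Microfloats.
# kind:   :inf_nan (has Inf and NaN), :nan_only (has NaN only), or :finite
# policy: :SAT or :OVF
# fmax:   stand-in for the destination type's floatmax
function overflow_result(x::Float64, fmax::Float64, kind::Symbol, policy::Symbol)
    if isnan(x)
        # A finite-only destination has no NaN encoding (the "Error" cells).
        kind === :finite && throw(DomainError(x, "destination cannot represent NaN"))
        return NaN
    elseif abs(x) > fmax                     # covers ±Inf and finite overflow
        policy === :SAT && return copysign(fmax, x)
        # OVF policy: use the destination's own special value, if it has one.
        kind === :inf_nan  && return copysign(Inf, x)
        kind === :nan_only && return NaN
        throw(DomainError(x, "value overflows a finite-only destination"))
    end
    return x                                 # in range; rounding is a separate step
end

overflow_result(-1000.0, 448.0, :inf_nan, :OVF)   # -Inf
overflow_result(Inf, 448.0, :nan_only, :SAT)      # 448.0
```

The value 448.0 is just an example floatmax; any positive finite bound behaves the same way.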
Microfloats.OVF — Type
OVF
Microfloats.SAT — Type
SAT