Microscaling
Documentation for Microscaling.
Microscaling.MXFP4
Microscaling.MXFP6_E2M3
Microscaling.MXFP6_E3M2
Microscaling.MXFP8_E4M3
Microscaling.MXFP8_E5M2
Microscaling.MXINT8
Microscaling.NVFP4
Microscaling.BlockFormat
Microscaling.quantize
Microscaling.MXFP4 — Constant
MXFP4
MXFP4 is a microscaling format using FP4 elements (E2M1, no NaN/Inf) with E8M0 scale factors, each scaling a contiguous block of 32 elements.
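To make the E2M1 element encoding concrete, the sketch below decodes the FP4 bit patterns using only Base Julia. It assumes the usual layout of 1 sign bit, 2 exponent bits (bias 1), and 1 mantissa bit; `decode_e2m1` is a hypothetical helper written for illustration and is not part of Microscaling.

```julia
# Hypothetical helper: decode a 4-bit E2M1 code (0 to 15) to Float32.
# Assumed layout: 1 sign bit, 2 exponent bits (bias 1), 1 mantissa bit.
function decode_e2m1(code::Integer)
    0 <= code <= 15 || throw(ArgumentError("E2M1 codes are 4 bits wide"))
    sign = ((code >> 3) & 1) == 1 ? -1.0f0 : 1.0f0
    e = (code >> 1) & 3      # exponent field
    m = code & 1             # mantissa field
    # e == 0 encodes the subnormals (0 and 0.5); otherwise there is an implicit leading 1.
    mag = e == 0 ? 0.5f0 * m : (1.0f0 + 0.5f0 * m) * 2.0f0^(e - 1)
    return sign * mag
end

# The eight non-negative values; codes 8 to 15 are their negations.
positive_values = [decode_e2m1(c) for c in 0:7]
# positive_values == Float32[0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
```

Because every bit pattern maps to a finite number, there is no room for NaN or Inf, which matches the "no NaN/Inf" note above.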
Microscaling.MXFP6_E2M3 — Constant
MXFP6_E2M3
MXFP6_E2M3 is a microscaling format using FP6 elements (E2M3, no NaN/Inf) with E8M0 scale factors, each scaling a contiguous block of 32 elements.
Microscaling.MXFP6_E3M2 — Constant
MXFP6_E3M2
MXFP6_E3M2 is a microscaling format using FP6 elements (E3M2, no NaN/Inf) with E8M0 scale factors, each scaling a contiguous block of 32 elements.
Microscaling.MXFP8_E4M3 — Constant
MXFP8_E4M3
MXFP8_E4M3 is a microscaling format using FP8 elements (E4M3, with NaN but no Inf) with E8M0 scale factors, each scaling a contiguous block of 32 elements.
Microscaling.MXFP8_E5M2 — Constant
MXFP8_E5M2
MXFP8_E5M2 is a microscaling format using FP8 elements (E5M2, including NaN/Inf) with E8M0 scale factors, each scaling a contiguous block of 32 elements.
Microscaling.MXINT8 — Constant
MXINT8
MXINT8 is a microscaling format using INT8 elements (Int8 values divided by 64, i.e. an implicit element scale of 2^-6) with E8M0 scale factors, each scaling a contiguous block of 32 elements.
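As a rough illustration of how an MXINT8 value is reconstructed, the sketch below combines the implicit division by 64 with a shared E8M0 scale. It assumes E8M0 stores a biased power-of-two exponent (bias 127) with the bit pattern 0xff reserved for NaN, as in the OCP MX specification; `decode_e8m0` and `decode_mxint8` are hypothetical helpers, not functions exported by Microscaling.

```julia
# Hypothetical helpers, Base Julia only.
# Assumption: E8M0 is a biased power-of-two exponent (bias 127), 0xff means NaN.
decode_e8m0(x::UInt8) = x == 0xff ? NaN : 2.0^(Int(x) - 127)

# An MXINT8 element is an Int8 divided by 64, then multiplied by the block scale.
decode_mxint8(p::Int8, x::UInt8) = (p / 64) * decode_e8m0(x)

a = decode_mxint8(Int8(64), UInt8(127))   # 64/64 with scale 2^0, gives 1.0
b = decode_mxint8(Int8(-96), UInt8(128))  # -96/64 = -1.5 with scale 2^1, gives -3.0
```

With this interpretation the unscaled elements cover roughly [-2, 2) in steps of 1/64, and the shared scale shifts that window by powers of two.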
Microscaling.NVFP4 — Constant
NVFP4
NVFP4 is a microscaling format using FP4 elements (E2M1, no NaN/Inf) with E4M3 scale factors, each scaling a contiguous block of 16 elements.
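The formats above differ mainly in element width, scale type, and block size. The short sketch below works out the resulting storage cost per element (element bits plus the scale amortized over its block). The bit widths are read off the format names, with both E8M0 and E4M3 scales taking 8 bits; `bits_per_element` is a hypothetical helper, and any additional per-tensor metadata a deployment might use is not counted.

```julia
# (name, element bits, scale bits, block size), taken from the descriptions above.
formats = [
    ("MXFP4",      4, 8, 32),
    ("MXFP6_E2M3", 6, 8, 32),
    ("MXFP6_E3M2", 6, 8, 32),
    ("MXFP8_E4M3", 8, 8, 32),
    ("MXFP8_E5M2", 8, 8, 32),
    ("MXINT8",     8, 8, 32),
    ("NVFP4",      4, 8, 16),
]

# Hypothetical helper: element bits plus the scale's bits shared across k elements.
bits_per_element(elem_bits, scale_bits, k) = elem_bits + scale_bits / k

for (name, eb, sb, k) in formats
    println(rpad(name, 12), bits_per_element(eb, sb, k))
end
# MXFP4 costs 4.25 bits per element, NVFP4 4.5, the FP6 formats 6.25,
# and the 8-bit formats 8.25.
```

The smaller NVFP4 block gives each scale fewer elements to cover, at the price of a quarter bit more per element than MXFP4.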
Microscaling.BlockFormat — Type
BlockFormat{E,S,k}
A block format specifies the element type E, the scale type S, and the number of elements per block k.
Microscaling.quantize — Method
quantize(V::AbstractArray, format::BlockFormat{E,S,k}; method=GenericMethod(), axis=:column)
Quantize the input array V to the given block format.
Arguments
V::AbstractArray: The input array to quantize.
format::BlockFormat{E,S,k}: The block format to quantize to.
method::Method: The method to use for quantization. Defaults to GenericMethod().
axis::Symbol: The axis to quantize along. Must be :column or :row; defaults to :column. If :row, the first two dimensions are transposed so that the blocks are contiguous along the first dimension.
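The effect of the axis argument on block layout can be pictured with plain array operations. The sketch below uses only Base Julia and a small block size so the result is easy to read; it illustrates how the description above reads and makes no claim about the shapes the package actually returns. The names `column_blocks` and `row_blocks` are made up for this example.

```julia
k = 4                                 # toy block size (the real formats use 32 or 16)

# axis = :column: blocks are runs of k consecutive entries down each column.
V = reshape(collect(1:16), 8, 2)      # 8×2 matrix, stored column-major
column_blocks = reshape(V, k, :)      # each column of the result is one block
# column_blocks[:, 1] == [1, 2, 3, 4]

# axis = :row: swap the first two dimensions first, so entries that were
# adjacent along a row become contiguous along the first dimension.
W = reshape(collect(1:16), 2, 8)      # 2×8 matrix
row_blocks = reshape(permutedims(W, (2, 1)), k, :)
# row_blocks[:, 1] == W[1, 1:4] == [1, 3, 5, 7]
```

In both cases the blocked dimension must have a length that is a multiple of k for this reshape to work; how the package handles other lengths is not stated here.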
Returns
X::AbstractArray: The scale factors.
P::AbstractArray: The quantized values.
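To show what a scale factor X and quantized values P mean for a single block, here is a minimal sketch of one common recipe for an MXFP4-style block: pick a power-of-two scale from the block's largest magnitude (the approach described in the OCP MX specification), then round the scaled values onto the E2M1 grid. This is an assumption-laden illustration in Base Julia, not the package's implementation; `round_e2m1` and `sketch_quantize_block` are hypothetical names, and the method used by GenericMethod() may differ.

```julia
# Hypothetical sketch, Base Julia only.
const E2M1_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

# Round to the nearest representable E2M1 value, saturating at ±6.
# Ties go to the smaller magnitude here; a real implementation would
# typically use round-to-nearest-even.
function round_e2m1(x::Real)
    mag = min(abs(x), 6.0)
    _, i = findmin(abs.(E2M1_MAGNITUDES .- mag))
    return copysign(E2M1_MAGNITUDES[i], x)
end

# Quantize one block: returns the power-of-two scale and the scaled elements.
function sketch_quantize_block(block::AbstractVector{<:Real})
    amax = maximum(abs, block)
    if amax == 0
        return 1.0, zeros(length(block))
    end
    # The largest E2M1 value is 6 = 1.5 * 2^2, so the element format's emax is 2.
    shared_exp = floor(Int, log2(amax)) - 2
    scale = 2.0^shared_exp
    return scale, [round_e2m1(v / scale) for v in block]
end

scale, elems = sketch_quantize_block([0.1, -0.7, 3.0, 12.0])
# scale == 2.0 and elems == [0.0, -0.5, 1.5, 6.0]
reconstructed = scale .* elems
# reconstructed == [0.0, -1.0, 3.0, 12.0]
```

The small entries lose the most precision because the scale is set by the largest value in the block, which is the usual trade-off of block-scaled formats. With the package itself, the corresponding call following the signature above would take the form quantize(V, MXFP4), returning the scale factors X and quantized values P.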