Mojo module
fp8_utils
Shared FP8 quantization utilities.
Provides common functions for FP8 scale computation and quantization used across fused normalization kernels and standalone quantization kernels.
Functions
- compute_dynamic_fp8_scale: Compute dynamic FP8 scale factor and its reciprocal from a row max.
- compute_static_fp8_scale_recip: Compute reciprocal scale for static FP8 quantization.
- fp8_quantize: Quantize values to FP8, optionally clamping to the representable range.
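
The arithmetic behind these three helpers can be sketched in plain Python. This is an illustration of the general technique, not the module's Mojo API: the names `dynamic_scale`, `static_scale_recip`, and `quantize`, the `eps` guard, and the choice of the E4M3 maximum (448) are all assumptions made for the example. The final rounding of each value onto the FP8 grid is omitted, since standard Python has no FP8 type.

```python
# Illustrative arithmetic only. All names here are hypothetical and do not
# mirror the signatures of the Mojo functions listed above.

FP8_E4M3_MAX = 448.0  # largest finite magnitude in the FP8 E4M3 (FN) format


def dynamic_scale(row_max, fp8_max=FP8_E4M3_MAX, eps=1e-12):
    """Return (scale, 1/scale) chosen so that row_max maps to fp8_max.

    The eps floor is an assumption that avoids division by zero for an
    all-zero row.
    """
    scale = max(abs(row_max), eps) / fp8_max
    return scale, 1.0 / scale


def static_scale_recip(scale):
    """Return the reciprocal of a precomputed (static) scale."""
    return 1.0 / scale


def quantize(values, scale_recip, clamp=True, fp8_max=FP8_E4M3_MAX):
    """Scale values into FP8 range, optionally clamping to [-fp8_max, fp8_max].

    Rounding to the nearest representable FP8 value is not modeled here.
    """
    out = []
    for v in values:
        q = v * scale_recip
        if clamp:
            q = max(-fp8_max, min(fp8_max, q))
        out.append(q)
    return out


if __name__ == "__main__":
    scale, recip = dynamic_scale(896.0)
    print(scale, recip)
    print(quantize([896.0, -1792.0, 448.0], recip))
```

In the dynamic case the scale is derived from the data (the row maximum), so the largest element lands on the edge of the representable range. In the static case the scale is known ahead of time, so only its reciprocal is needed, and clamping matters more because inputs can exceed the range the scale was calibrated for.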