Skip to main content

Mojo module

rms_norm_fp8

Fused RMSNorm + FP8 quantization kernel.

Provides the fused RMSNorm + FP8 quantization primitive used by both the standalone normalization layer and the fused allreduce + RMSNorm + FP8 kernel. Lives in comm/ so that allreduce_residual_rmsnorm_fp8 can depend on it without introducing a comm → nn → comm circular dependency.

Functions

Was this page helpful?