
Mojo module

blockwise_fp8_matmul

CPU entry points for blockwise FP8 SM100 matmul.

Creates TMA descriptors for A, B, C and A-scales, then launches the warp-specialized blockwise FP8 kernel with register-based accumulation.
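As a rough illustration of the arithmetic a blockwise-scaled matmul performs, the following plain-Python sketch accumulates one partial sum per K-block and applies the scale factors before adding it to the running total. It is not this module's API and does not model TMA descriptors, warp specialization, or FP8 quantization. The function name, the scale layouts (one A-scale per row and K-block, one B-scale per N-block and K-block), and the `N x K` layout of `b` are assumptions made for the example.

```python
def blockwise_scaled_matmul(a, b, a_scales, b_scales, block_k, block_n):
    """Reference for C = A @ B^T with per-block scaling (hypothetical helper).

    a:        M x K values (plain floats stand in for FP8 here)
    b:        N x K values (assumed layout)
    a_scales: M x (K / block_k), one scale per row and K-block (assumed)
    b_scales: ceil(N / block_n) x (K / block_k) (assumed)
    """
    m_dim, k_dim, n_dim = len(a), len(a[0]), len(b)
    assert k_dim % block_k == 0, "K must be a multiple of block_k"
    c = [[0.0] * n_dim for _ in range(m_dim)]
    for m in range(m_dim):
        for n in range(n_dim):
            acc = 0.0  # running accumulator for this output element
            for kb in range(k_dim // block_k):
                # Unscaled partial dot product over one K-block.
                partial = 0.0
                for k in range(kb * block_k, (kb + 1) * block_k):
                    partial += a[m][k] * b[n][k]
                # Apply both scales to the partial sum, then accumulate.
                acc += partial * a_scales[m][kb] * b_scales[n // block_n][kb]
            c[m][n] = acc
    return c
```

Scaling each K-block's partial sum before accumulating is what distinguishes blockwise scaling from applying a single scale to the whole tensor; a reference like this can be useful for checking a kernel's output on small inputs.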

Functions
