Mojo module
block_scaled_matmul
CPU entry points for block-scaled SM100 matmul.
Creates TMA descriptors for A, B, C and scaling factors (SFA, SFB), then launches the warp-specialized kernel.
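As a rough illustration of what a block-scaled matmul computes, the following pure-Python reference may help. It is only a sketch of the arithmetic, not this module's API: the function name `block_scaled_matmul_reference`, the `block_k` parameter, the N x K layout for B, and the one-scale-per-row-per-K-block layout for SFA and SFB are all assumptions made for illustration. The real kernel's FP8 formats, scale-factor layouts, and block sizes are not stated on this page.

```python
def block_scaled_matmul_reference(a, b, sfa, sfb, block_k):
    """Reference arithmetic for a block-scaled matmul (illustrative only).

    Assumed (hypothetical) layouts:
      a   : M x K values
      b   : N x K values (B stored transposed)
      sfa : M x (K // block_k) scale factors for A
      sfb : N x (K // block_k) scale factors for B
    """
    m_dim, k_dim = len(a), len(a[0])
    n_dim = len(b)
    assert k_dim % block_k == 0, "K must be a multiple of block_k"
    num_blocks = k_dim // block_k

    c = [[0.0] * n_dim for _ in range(m_dim)]
    for m in range(m_dim):
        for n in range(n_dim):
            acc = 0.0
            for kb in range(num_blocks):
                start = kb * block_k
                # Unscaled dot product over one K-block.
                partial = sum(
                    a[m][k] * b[n][k] for k in range(start, start + block_k)
                )
                # Apply the A and B scale factors for this block.
                acc += sfa[m][kb] * sfb[n][kb] * partial
            c[m][n] = acc
    return c


# Tiny usage example: M = 1, N = 1, K = 4, two K-blocks of size 2.
example_c = block_scaled_matmul_reference(
    a=[[1.0, 2.0, 3.0, 4.0]],
    b=[[1.0, 1.0, 1.0, 1.0]],
    sfa=[[2.0, 0.5]],
    sfb=[[1.0, 4.0]],
    block_k=2,
)
```

On the GPU, the same per-block scaling would be applied by the hardware during the matrix multiply rather than in an explicit loop; the sketch only pins down the intended result.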
comptime values
UnsafePointer
comptime UnsafePointer = LegacyUnsafePointer[?, address_space=?, origin=?]
Functions
- blackwell_block_scaled_matmul_tma_umma_warp_specialized: Launch block-scaled FP8 matmul kernel on SM100.