Mojo module
blockwise_fp8_1d2d_matmul
CPU entrypoint for grouped 1D2D blockwise FP8 SM100 matmul.
This module provides the public API for launching the grouped 1D2D blockwise FP8 matmul kernel for Mixture of Experts (MoE) layers.
Usage:

```mojo
grouped_matmul_1d2d_blockwise_fp8[transpose_b=True, config=config](
    c_tensor,
    a_tensor,
    b_tensor,
    a_scales,
    b_scales,
    a_offsets,
    expert_ids,
    expert_scales,
    num_active_experts,
    ctx,
)
```
Functions

- grouped_matmul_1d2d_blockwise_fp8: CPU entrypoint that launches the grouped 1D2D blockwise FP8 SM100 matmul kernel.
- grouped_matmul_dynamic_scaled_fp8_1d2d: Compatibility wrapper that matches the existing dispatch API.
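To clarify what "1D2D blockwise" scaling means numerically, the sketch below shows the usual convention in blockwise FP8 schemes: activations carry 1D scales (one scale per row per block of K columns) while weights carry 2D scales (one scale per 2D tile). This is an illustrative NumPy model, not the Mojo kernel; the block size of 128, the array shapes, and the `transpose_b=True` layout are assumptions for the example.

```python
import numpy as np

BLOCK = 128  # assumed scale-block size


def blockwise_dequant_matmul(a_q, a_scales, b_q, b_scales):
    """Reference (dequantize-then-matmul) model of a 1D2D blockwise matmul.

    a_q:      (M, K) quantized activations
    a_scales: (M, K // BLOCK) 1D scales: per row, per K-block
    b_q:      (N, K) quantized weights (transpose_b=True layout)
    b_scales: (N // BLOCK, K // BLOCK) 2D scales: per (N-block, K-block) tile
    """
    # Expand each 1D scale across its block of K columns.
    a = a_q * np.repeat(a_scales, BLOCK, axis=1)
    # Expand each 2D scale across its BLOCK x BLOCK tile.
    b = b_q * np.kron(b_scales, np.ones((BLOCK, BLOCK)))
    return a @ b.T  # C = A @ B^T


# Sanity check: with all scales equal to 1, the result is a plain matmul.
rng = np.random.default_rng(0)
M, K, N = 2, 256, 128
a_q = rng.standard_normal((M, K))
b_q = rng.standard_normal((N, K))
c = blockwise_dequant_matmul(
    a_q, np.ones((M, K // BLOCK)), b_q, np.ones((N // BLOCK, K // BLOCK))
)
```

The real kernel fuses the scale application into the accumulation loop instead of materializing dequantized matrices, but the numerics it targets are those of this reference.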