Mojo module
blockwise_fp8_1d2d_matmul
CPU entrypoint for the grouped 1D-2D blockwise FP8 SM100 matmul.
This module provides the public API for launching the grouped 1D-2D blockwise FP8 matmul kernel used in Mixture of Experts (MoE) layers.
Usage:

```mojo
grouped_matmul_1d2d_blockwise_fp8[transpose_b=True, config=config](
    c_tensor,
    a_tensor,
    b_tensor,
    a_scales,
    b_scales,
    a_offsets,
    expert_ids,
    expert_scales,
    num_active_experts,
    ctx,
)
```
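For intuition about the scaling scheme the kernel implements, here is a minimal pure-Python sketch of a blockwise-dequantized matmul, assuming A carries 1D scales (one per row per K-block) and B carries 2D scales (one per K-by-N block), as the `1d2d` naming suggests. The function name, shapes, and tiny block size are illustrative only and are not part of the Mojo API:

```python
BLOCK = 2  # tiny block size for illustration; real blockwise FP8 kernels typically use 128


def dequant_matmul(a, a_scales, b, b_scales):
    """C[m][n] = sum_k (A[m][k] * a_scales[m][k//BLOCK]) * (B[k][n] * b_scales[k//BLOCK][n//BLOCK])."""
    M, K = len(a), len(a[0])
    N = len(b[0])
    c = [[0.0] * N for _ in range(M)]
    for m in range(M):
        for n in range(N):
            acc = 0.0
            for k in range(K):
                sa = a_scales[m][k // BLOCK]           # 1D: per row, per K-block
                sb = b_scales[k // BLOCK][n // BLOCK]  # 2D: per (K, N) block
                acc += (a[m][k] * sa) * (b[k][n] * sb)
            c[m][n] = acc
    return c


a = [[1.0, 2.0], [3.0, 4.0]]
a_scales = [[0.5], [2.0]]   # one scale per row per K-block
b = [[1.0, 0.0], [0.0, 1.0]]
b_scales = [[1.0]]          # one scale per (K, N) block
print(dequant_matmul(a, a_scales, b, b_scales))  # [[0.5, 1.0], [6.0, 8.0]]
```

The grouped kernel applies this per expert, using `a_offsets` and `expert_ids` to select each expert's rows of A and weight matrix B.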
Functions

- `grouped_matmul_1d2d_blockwise_fp8`: Launches the grouped 1D-2D blockwise FP8 matmul kernel.
- `grouped_matmul_dynamic_scaled_fp8_1d2d`: Compatibility wrapper that matches the existing dispatch API.