Mojo module
blockwise_fp8_1d2d_matmul
CPU entrypoint for grouped 1D2D blockwise FP8 SM100 matmul.
This module provides the public API for launching the grouped 1D2D blockwise FP8 matmul kernel for Mixture of Experts (MoE) layers.
Usage:

```mojo
grouped_matmul_1d2d_blockwise_fp8[transpose_b=True, config=config](
    c_tensor,
    a_tensor,
    b_tensor,
    a_scales,
    b_scales,
    a_offsets,
    expert_ids,
    expert_scales,
    num_active_experts,
    ctx,
)
```
Functions

- grouped_matmul_1d2d_blockwise_fp8: CPU entrypoint that launches the grouped 1D2D blockwise FP8 SM100 matmul kernel.
- grouped_matmul_dynamic_scaled_fp8_1d2d: Compatibility wrapper that matches the existing dispatch API.
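To clarify what "1D2D blockwise" scaling means numerically, the sketch below shows the usual convention in blockwise FP8 schemes: activations carry 1D scales (one scale per row per block of K columns) while weights carry 2D scales (one scale per 2D tile). This is an illustrative NumPy model, not the Mojo kernel; the block size of 128, the array shapes, and the `transpose_b=True` layout are assumptions for the example.

```python
import numpy as np

BLOCK = 128  # assumed scale-block size


def blockwise_dequant_matmul(a_q, a_scales, b_q, b_scales):
    """Reference (dequantize-then-matmul) model of a 1D2D blockwise matmul.

    a_q:      (M, K) quantized activations
    a_scales: (M, K // BLOCK) 1D scales: per row, per K-block
    b_q:      (N, K) quantized weights (transpose_b=True layout)
    b_scales: (N // BLOCK, K // BLOCK) 2D scales: per (N-block, K-block) tile
    """
    # Expand each 1D scale across its block of K columns.
    a = a_q * np.repeat(a_scales, BLOCK, axis=1)
    # Expand each 2D scale across its BLOCK x BLOCK tile.
    b = b_q * np.kron(b_scales, np.ones((BLOCK, BLOCK)))
    return a @ b.T  # C = A @ B^T


# Sanity check: with all scales equal to 1, the result is a plain matmul.
rng = np.random.default_rng(0)
M, K, N = 2, 256, 128
a_q = rng.standard_normal((M, K))
b_q = rng.standard_normal((N, K))
c = blockwise_dequant_matmul(
    a_q, np.ones((M, K // BLOCK)), b_q, np.ones((N // BLOCK, K // BLOCK))
)
```

The real kernel fuses the scale application into the accumulation loop instead of materializing dequantized matrices, but the numerics it targets are those of this reference.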