Mojo module

grouped_1d1d_matmul

CPU entrypoint for grouped 1D-1D block-scaled SM100 matmul.

This module provides the public API for launching the grouped 1D-1D matmul kernel for Mixture of Experts (MoE) layers.

Usage:

    grouped_matmul_1d1d_nvfp4[transpose_b=True, config=config](
        c_tensor,            # Output: TileTensor (total_tokens, N)
        a_tensor,            # Input A: TileTensor (total_tokens, K)
        a_offsets,           # Per-expert offsets: TileTensor 1D
        a_scale_offsets,     # Per-expert scale offsets: TileTensor 1D
        b_tensor,            # Weights B: TileTensor (num_experts, N, K)
        expert_ids,          # Active expert IDs: TileTensor 1D
        a_scales,            # Scale factors for A: TileTensor 5D
        b_scales,            # Scale factors for B: TileTensor 6D
        expert_scales,       # Per-expert output scaling: TileTensor 1D
        num_active_experts,
        ctx,
    )
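To make the argument shapes concrete, the following is a minimal host-side reference sketch in plain Python, not the kernel itself. It rests on several assumptions that this page does not confirm: that `a_offsets` holds `num_active_experts + 1` prefix-sum row boundaries into A, that `expert_scales` is indexed by expert ID, and that with `transpose_b=True` each token row is multiplied by the transpose of its expert's (N, K) weight matrix. Block scaling (`a_scales`, `b_scales`, `a_scale_offsets`) is omitted entirely, and the function name `grouped_matmul_reference` is hypothetical.

```python
def grouped_matmul_reference(a, b, a_offsets, expert_ids, expert_scales,
                             num_active_experts):
    """Unscaled reference for the ASSUMED grouped matmul semantics.

    a: list of total_tokens rows, each of length K
    b: list of num_experts matrices, each N rows by K columns
    a_offsets: assumed prefix sums, length num_active_experts + 1
    expert_ids: expert index for each active slot
    expert_scales: assumed to be indexed by expert ID
    """
    n_cols = len(b[0])
    c = [[0.0] * n_cols for _ in a]
    for slot in range(num_active_experts):
        expert = expert_ids[slot]
        start, end = a_offsets[slot], a_offsets[slot + 1]
        scale = expert_scales[expert]
        for t in range(start, end):
            for n in range(n_cols):
                # Dot product of token row t with row n of B[expert],
                # i.e. A @ B[expert]^T for weights stored as (N, K).
                acc = sum(x * w for x, w in zip(a[t], b[expert][n]))
                c[t][n] = scale * acc
    return c


# Three tokens with K=2; two experts with N=2.
a = [[1, 2], [3, 4], [5, 6]]
b = [
    [[1, 0], [0, 1]],    # expert 0: identity
    [[1, 1], [1, -1]],   # expert 1
]
a_offsets = [0, 2, 3]        # slot 0 owns rows 0..1, slot 1 owns row 2
expert_ids = [1, 0]          # slot 0 uses expert 1, slot 1 uses expert 0
expert_scales = [2.0, 1.0]

c = grouped_matmul_reference(a, b, a_offsets, expert_ids, expert_scales, 2)
print(c)
```

In this sketch the first two rows go through expert 1 with a scale of 1.0 and the last row goes through expert 0 with a scale of 2.0. A reference of this kind is useful mainly for checking output shapes and token-to-expert routing; it says nothing about the NVFP4 scale-factor layout.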

Functions
