Mojo module

grouped_1d1d_tile_scheduler

Work scheduler for grouped 1D-1D block-scaled SM100 matmul.

Provides work iteration using offset-based addressing for the 1D-1D tensor layout. This is a port of the TileScheduler from grouped_matmul_tile_scheduler.mojo to the structured kernels architecture with context manager patterns.

Key characteristics:

  • Uses a_offsets tensor for group boundaries (prefix sum of token counts)
  • Each iteration returns (m_coord, n_coord, expert_id, expert_scale)
  • Supports block swizzling for L2 cache efficiency
  • 3-warp specialization (no scheduler warp)
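
As a rough illustration of the offset-based addressing described above, the following is a host-side Python model, not the Mojo implementation. It assumes that a_offsets has one more entry than there are experts, that m_coord is a row offset into the packed token dimension, and that n_coord is a column offset. The names iter_work, expert_scales, tile_m, tile_n, cta_id, and num_ctas are hypothetical, and block swizzling is left out for brevity.

```python
from typing import Iterator, List, Tuple

WorkItem = Tuple[int, int, int, float]  # (m_coord, n_coord, expert_id, expert_scale)


def iter_work(
    a_offsets: List[int],        # prefix sum of token counts, length = num_experts + 1
    expert_scales: List[float],  # hypothetical per-expert scale factors
    n: int,                      # output columns per expert
    tile_m: int,
    tile_n: int,
    cta_id: int = 0,             # hypothetical persistent-block index
    num_ctas: int = 1,
) -> Iterator[WorkItem]:
    """Yield the tiles one block would process, striding over a linear tile index."""
    n_tiles = -(-n // tile_n)  # ceiling division
    linear = 0
    for expert_id in range(len(a_offsets) - 1):
        start, end = a_offsets[expert_id], a_offsets[expert_id + 1]
        m_tiles = -(-(end - start) // tile_m)  # zero for an expert with no tokens
        for m_tile in range(m_tiles):
            for n_tile in range(n_tiles):
                if linear % num_ctas == cta_id:
                    yield (
                        start + m_tile * tile_m,  # row offset into packed tokens
                        n_tile * tile_n,
                        expert_id,
                        expert_scales[expert_id],
                    )
                linear += 1


# Three experts with 3, 0, and 5 tokens respectively.
offsets = [0, 3, 3, 8]
scales = [1.0, 0.5, 2.0]
for item in iter_work(offsets, scales, n=8, tile_m=4, tile_n=4):
    print(item)
```

In this model an expert with no tokens contributes no tiles, and the last M tile of a group may extend past the group boundary, so a real kernel would need to mask those rows.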

Structs