Mojo module
grouped_tile_scheduler
Grouped tile scheduler for SM100 structured block-scaled GEMM.
This scheduler extends the SM100 TileScheduler to support grouped GEMM, where each group can have a different problem size. Instead of CLC (Cluster Launch Control), it uses linear tile iteration: a global linear tile index is mapped to a group index plus tile coordinates local to that group.
Key features:
- GroupedWorkInfo: Extends WorkInfo with group_idx, k_tile_count, group_changed
- delinearize_to_group(): Maps linear tile index to group + local coordinates
- Supports variable M, N, K per group
- Compatible with dynamic tensormap updates
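To make the index mapping concrete, here is a minimal Python model of what delinearizing a linear tile index could look like. It is a sketch under stated assumptions, not the module's implementation: the names `WorkInfoSketch` and `delinearize_to_group_sketch` are hypothetical, tiles inside a group are assumed to be ordered row-major (M outer, N inner), and `problem_sizes` is assumed to be a list of `(M, N, K)` tuples. The real scheduler may order tiles differently.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return (a + b - 1) // b


@dataclass
class WorkInfoSketch:
    """Hypothetical stand-in for the group-specific fields of GroupedWorkInfo."""
    group_idx: int
    m_tile: int
    n_tile: int
    k_tile_count: int


def delinearize_to_group_sketch(
    linear_idx: int,
    problem_sizes: List[Tuple[int, int, int]],
    tile_shape: Tuple[int, int, int],
) -> Optional[WorkInfoSketch]:
    """Map a global linear tile index to (group, local tile coordinates).

    Assumes row-major tile order within each group. Returns None when
    linear_idx is past the last tile of the last group.
    """
    tile_m, tile_n, tile_k = tile_shape
    remaining = linear_idx
    for group_idx, (m, n, k) in enumerate(problem_sizes):
        tiles_n = ceil_div(n, tile_n)
        tiles_in_group = ceil_div(m, tile_m) * tiles_n
        if remaining < tiles_in_group:
            return WorkInfoSketch(
                group_idx=group_idx,
                m_tile=remaining // tiles_n,
                n_tile=remaining % tiles_n,
                k_tile_count=ceil_div(k, tile_k),  # K varies per group
            )
        remaining -= tiles_in_group  # skip past this group's tiles
    return None


# Two groups with different M, N, K and a 128x128x64 tile.
sizes = [(256, 128, 512), (128, 384, 256)]
tile = (128, 128, 64)
for idx in range(6):
    print(idx, delinearize_to_group_sketch(idx, sizes, tile))
```

With these sizes the first group contributes 2 output tiles and the second contributes 3, so index 2 is the first tile of group 1 and index 5 has no work.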
Usage:

    var scheduler = GroupedTileScheduler[...](problem_sizes, tile_shape)
    var work_iter = scheduler.work_iterator()
    while work_iter.has_work():
        with work_iter.next() as current:
            if current.group_changed:
                update_tensormaps(current.group_idx)
            process_tile(current)
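The role of `group_changed` in the usage loop can be illustrated with a small Python model of a strided work iterator. This is a sketch built on assumptions, not the module's code: `iterate_work_sketch` is a hypothetical name, each worker is assumed to start at its own index and advance by a fixed stride (as a persistent kernel typically does), and `group_changed` is assumed to be true whenever the group differs from the previous tile this worker handled, including on its first tile.

```python
from bisect import bisect_right
from itertools import accumulate
from typing import Iterator, List, Tuple


def iterate_work_sketch(
    tiles_per_group: List[int], start: int, stride: int
) -> Iterator[Tuple[int, int, bool]]:
    """Yield (group_idx, local_tile_idx, group_changed) for one worker.

    The worker visits linear tile indices start, start + stride, ...
    until it runs past the total number of tiles.
    """
    # ends[g] is the exclusive end offset of group g in the linear tile space.
    ends = list(accumulate(tiles_per_group))
    total = ends[-1] if ends else 0
    prev_group = -1  # sentinel: no group seen yet
    linear = start
    while linear < total:
        # First group whose end offset is strictly greater than linear;
        # groups with zero tiles are skipped automatically.
        group_idx = bisect_right(ends, linear)
        group_start = ends[group_idx - 1] if group_idx > 0 else 0
        yield group_idx, linear - group_start, group_idx != prev_group
        prev_group = group_idx
        linear += stride


# Three groups with 2, 3, and 1 tiles, shared by two workers.
for worker in range(2):
    for group_idx, local_tile, changed in iterate_work_sketch([2, 3, 1], worker, 2):
        action = "update tensormaps, then process" if changed else "process"
        print(f"worker {worker}: group {group_idx} tile {local_tile}: {action}")
```

In this model a worker only pays for a tensormap update when it crosses into a different group, which is the reason the flag is worth carrying alongside the tile coordinates.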
Structs
- GroupedAdvanceContext: Context manager that returns current work and advances on exit.
- GroupedCLCSchedulerIterator: Scheduler warp iterator for grouped GEMM with CLC.
- GroupedCLCWaitAndAdvanceContext: Context for waiting on CLC barrier and advancing work iterator.
- GroupedCLCWorkIterator: Per-warp work iterator for grouped GEMM with CLC barrier support.
- GroupedTileScheduler: Tile scheduler for grouped block-scaled GEMM.
- GroupedWorkInfo: Work info for grouped GEMM with group-specific metadata.
- GroupedWorkIterator: Per-warp work iterator for grouped GEMM.