Mojo module
grouped_tile_scheduler
Grouped tile scheduler for SM100 structured block-scaled GEMM.
This scheduler extends the SM100 TileScheduler to support grouped GEMM with variable problem sizes per group. Instead of relying on CLC (Cluster Launch Control), it iterates over tiles linearly and maps each global linear tile index to group-specific coordinates.
Key features:
- GroupedWorkInfo: Extends WorkInfo with group_idx, k_tile_count, group_changed
- delinearize_to_group(): Maps linear tile index to group + local coordinates
- Supports variable M, N, K per group
- Compatible with dynamic tensormap updates
Usage:

    var scheduler = GroupedTileScheduler[...](problem_sizes, tile_shape)
    var work_iter = scheduler.work_iterator()
    for current in work_iter:
        if current.group_changed:
            update_tensormaps(current.group_idx)
        process_tile(current)
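The mapping from a global linear tile index to a group and local tile coordinates can be illustrated with a small host-side Python sketch. This is not the module's implementation: the row-major tile order within each group, the `(M, N, K)` tuple layout of `problem_sizes`, the `start`/`stride` parameters, and the `WorkInfoSketch` type are all assumptions made for illustration. Only the names `delinearize_to_group`, `group_idx`, `k_tile_count`, and `group_changed` come from the description above.

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Sequence, Tuple


@dataclass
class WorkInfoSketch:
    """Hypothetical stand-in for GroupedWorkInfo (field layout is assumed)."""
    m: int               # tile row index within the group
    n: int               # tile column index within the group
    group_idx: int       # which problem (group) the tile belongs to
    k_tile_count: int    # number of K tiles to accumulate for this group
    group_changed: bool  # True when the group differs from the previous tile


def ceil_div(a: int, b: int) -> int:
    return -(-a // b)


def delinearize_to_group(
    linear_idx: int,
    problem_sizes: Sequence[Tuple[int, int, int]],
    tile_shape: Tuple[int, int, int],
) -> Optional[Tuple[int, int, int, int]]:
    """Return (group_idx, m, n, k_tile_count), or None past the last tile."""
    tile_m, tile_n, tile_k = tile_shape
    offset = 0  # linear index of the first tile of the current group
    for group_idx, (M, N, K) in enumerate(problem_sizes):
        tiles_n = ceil_div(N, tile_n)
        count = ceil_div(M, tile_m) * tiles_n
        if linear_idx < offset + count:
            local = linear_idx - offset
            # Assumed row-major order of tiles inside a group.
            return group_idx, local // tiles_n, local % tiles_n, ceil_div(K, tile_k)
        offset += count
    return None


def work_iterator(
    problem_sizes: Sequence[Tuple[int, int, int]],
    tile_shape: Tuple[int, int, int],
    start: int = 0,
    stride: int = 1,
) -> Iterator[WorkInfoSketch]:
    """Yield tiles start, start + stride, ... and flag group transitions."""
    prev_group = -1
    idx = start
    while True:
        hit = delinearize_to_group(idx, problem_sizes, tile_shape)
        if hit is None:
            return
        group_idx, m, n, k_tiles = hit
        yield WorkInfoSketch(m, n, group_idx, k_tiles, group_idx != prev_group)
        prev_group = group_idx
        idx += stride


if __name__ == "__main__":
    sizes = [(256, 128, 64), (128, 256, 128)]  # per-group (M, N, K)
    for work in work_iterator(sizes, tile_shape=(128, 128, 64)):
        print(work)
```

In this sketch, `group_changed` is what a consumer would check before refreshing per-group state such as tensormaps, since consecutive tiles in the same group can reuse the existing descriptors.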
Structs
- GroupedCLCSchedulerIterator: Scheduler warp iterator for grouped GEMM with CLC.
- GroupedCLCWorkIterator: Per-warp work iterator for grouped GEMM with CLC barrier support.
- GroupedTileScheduler: Tile scheduler for grouped block-scaled GEMM.
- GroupedWorkInfo: Work info for grouped GEMM with group-specific metadata.
- GroupedWorkIterator: Per-warp work iterator for grouped GEMM using next-style iteration.