Mojo package
sm100_structured
Provides NVIDIA Blackwell (SM100) backend implementations of matmul as Structured Kernels.
Modules
- barriers: Barrier abstractions for SM100 structured matmul kernels.
- block_scaled_matmul: CPU entry points for block-scaled SM100 matmul.
- block_scaled_matmul_kernel: Block-scaled SM100 matmul kernel - Structured kernel using tile pipelines.
- block_scaled_output_writer: BlockScaledTileWriter for SM100 block-scaled matmul output pipeline.
- block_scaled_smem: Shared memory layout for block-scaled SM100 matmul.
- matmul: SM100 Matmul CPU entry points - TMA setup and kernel launch wrappers.
- matmul_kernels: SM100 Matmul Kernel Structs - GPU kernel entry points and helpers.
- output_writer: TileWriter for SM100 matmul output pipeline.
- pipeline: Re-export ProducerConsumerPipeline from legacy sm100 module.
- tile_loader: TMA tile loader for SM100 matrix multiplication.
- tile_pipeline: Tile pipeline for SM100 producer-consumer synchronization.
- tile_scheduler
- tile_scheduler_splitk
- tile_writer: TileWriter components for SM100 matrix multiplication epilogue.
- tmem: Tensor Memory (TMEM) abstractions for SM100 Blackwell GPUs.
- warp_context: RAII warp context managers for SM100 matmul kernel.