For the complete documentation index, see llms.txt. Markdown versions of all pages are available by appending .md to any URL (e.g. /max/get-started.md).
Mojo package
structured_kernels
Shared GPU kernel primitives for structured kernel architectures.
This package provides architecture-agnostic building blocks used by SM90, SM100, and other GPU kernel implementations:
- pipeline: Producer-consumer pipeline synchronization
- pipeline_storage: Barrier pair storage and pipeline factory
- tile_types: TileTensor-based shared memory tile abstractions
- kernel_common: Warp role dispatch and kernel context
- barriers: Composable barrier storage for SMEM structs
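
This page names producer-consumer pipeline synchronization and barrier pair storage but does not show the corresponding API. As a rough CPU-side analogy only, the sketch below models the general idea in plain Python threads: each pipeline stage owns a pair of signals, one meaning "slot is empty" and one meaning "slot is full". It is not Mojo and it is not this package's interface; `StagePipeline`, `produce`, `consume`, and `run_pipeline` are hypothetical names chosen for illustration, and semaphores stand in for GPU shared-memory barriers.

```python
import threading


class StagePipeline:
    """Ring of `num_stages` slots, each guarded by a full/empty signal pair.

    Hypothetical illustration: semaphores stand in for hardware barriers.
    """

    def __init__(self, num_stages):
        self.num_stages = num_stages
        self.slots = [None] * num_stages
        # Every slot starts out empty, so the producer may fill it right away.
        self.empty = [threading.Semaphore(1) for _ in range(num_stages)]
        self.full = [threading.Semaphore(0) for _ in range(num_stages)]

    def produce(self, index, tile):
        stage = index % self.num_stages
        self.empty[stage].acquire()   # wait until the consumer released the slot
        self.slots[stage] = tile
        self.full[stage].release()    # signal that the slot now holds data

    def consume(self, index):
        stage = index % self.num_stages
        self.full[stage].acquire()    # wait until the producer filled the slot
        tile = self.slots[stage]
        self.empty[stage].release()   # hand the slot back to the producer
        return tile


def run_pipeline(tiles, num_stages=2):
    """Stream `tiles` through the pipeline; the consumer doubles each one."""
    pipe = StagePipeline(num_stages)
    results = []

    def producer():
        for i, tile in enumerate(tiles):
            pipe.produce(i, tile)

    def consumer():
        for i in range(len(tiles)):
            results.append(pipe.consume(i) * 2)

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With more than one stage, the producer can run ahead of the consumer by up to `num_stages` tiles, which is the overlap a multi-stage pipeline is meant to provide.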
Modules
- amd_tile_io: TileTensor data movement and AMD GPU hardware operations.
- amd_tile_io_conv: DRAM->LDS DMA loader for AMD implicit-GEMM convolution.
- barriers: Barrier abstractions for SM100 structured matmul kernels.
- kernel_common: Shared kernel components for SM100 warp-specialized matmul kernels.
- pipeline: Producer-consumer pipeline utilities for SM100 structured kernels.
- pipeline_storage: Unified Pipeline Storage Framework for SM100 Structured Kernels.
- smem_types: Shared memory type aliases for LayoutTensor-based GPU kernels.
- tile_types: Native TileTensor types for SM100 structured kernels.
- trace_buf: Zero-overhead per-CTA trace buffer for GPU kernel instrumentation.