
Mojo module

tile_loader

TMA tile loader for SM100 matrix multiplication.

Provides a wrapper around TMA async_multicast_load operations, following the SM90 TileLoaderTMA pattern. Orchestration logic (k-group iteration, expect_bytes, barrier management) is handled by the kernel, not the loader.

Usage:

    # In kernel - create separate A and B loaders
    var a_loader = ATileLoaderType(Pointer(to=a_tma_op), ctx.a_multicast_mask)
    var b_loader = BTileLoaderType(Pointer(to=b_tma_op), ctx.b_multicast_mask)

    # Load tiles using the loaders
    a_loader.load(a_tile, barrier, k_coord, m_coord)
    b_loader.load(b_tile, barrier, k_coord, n_coord)
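The split of responsibilities described above (a thin loader, with k-group iteration, expect_bytes, and barrier handling left to the kernel) can be illustrated with a small host-side model. This is a sketch in plain Python, not Mojo or GPU code. Every name in it (ToyBarrier, ToyTileLoader, toy_kernel, BYTES_PER_ELEM) is hypothetical and does not belong to the tile_loader API. It also assumes that B is stored as N x K, so that both loaders take (k_coord, row_coord) in the same order as the usage above.

```python
# Toy host-side model of the loader/kernel split. All names are hypothetical;
# this is not the tile_loader API and performs no real TMA operations.
from dataclasses import dataclass

BYTES_PER_ELEM = 2  # assumption: 16-bit elements


def zeros(rows, cols):
    return [[0] * cols for _ in range(rows)]


@dataclass
class ToyBarrier:
    """Stands in for a shared-memory barrier that tracks expected bytes."""
    pending_bytes: int = 0

    def expect_bytes(self, n):
        self.pending_bytes += n

    def complete(self, n):
        self.pending_bytes -= n

    def is_ready(self):
        return self.pending_bytes == 0


@dataclass
class ToyTileLoader:
    """Thin wrapper: holds a source tensor and a multicast mask, issues one load."""
    source: list          # 2D list standing in for the tensor behind a TMA descriptor
    multicast_mask: int   # stored only; multicast is not modeled here
    tile_rows: int
    tile_cols: int

    def load(self, dest, barrier, k_coord, row_coord):
        # Copy one tile; the loader does no looping over k and no expect_bytes.
        for r in range(self.tile_rows):
            for c in range(self.tile_cols):
                dest[r][c] = self.source[row_coord + r][k_coord + c]
        barrier.complete(self.tile_rows * self.tile_cols * BYTES_PER_ELEM)


def toy_kernel(a, b, tile_m, tile_n, tile_k):
    """Computes one (tile_m x tile_n) output tile of A @ B^T at m = n = 0."""
    a_loader = ToyTileLoader(a, 0b1, tile_m, tile_k)
    b_loader = ToyTileLoader(b, 0b1, tile_n, tile_k)
    acc = zeros(tile_m, tile_n)
    m_coord = n_coord = 0
    # Orchestration lives in the kernel: k iteration, expect_bytes, barrier wait.
    for k_coord in range(0, len(a[0]), tile_k):
        barrier = ToyBarrier()
        a_tile, b_tile = zeros(tile_m, tile_k), zeros(tile_n, tile_k)
        barrier.expect_bytes((tile_m + tile_n) * tile_k * BYTES_PER_ELEM)
        a_loader.load(a_tile, barrier, k_coord, m_coord)
        b_loader.load(b_tile, barrier, k_coord, n_coord)
        assert barrier.is_ready()  # both tiles have arrived
        for i in range(tile_m):
            for j in range(tile_n):
                for kk in range(tile_k):
                    acc[i][j] += a_tile[i][kk] * b_tile[j][kk]
    return acc
```

In this model the loader's load method only moves one tile and signals the barrier. The kernel function decides how many bytes to expect and when to advance along k, which mirrors the division of labor the module description states.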

Structs
