Mojo function
ds_read_tr16_b64_warp
ds_read_tr16_b64_warp[mma_shape: IndexList[3]](tile: TileTensor[address_space=AddressSpace.SHARED, linear_idx_type=tile.linear_idx_type, element_size=tile.element_size]) -> SIMD[tile.dtype, 4]
Warp-level transposed LDS (shared memory) read, distributed across 16-lane rows.
For 32x32x16 MMA: 2x2 row distribution over 8x32 tile. For 16x16x32 MMA: 4x1 row distribution over 16x16 tile.
Parameters:
- mma_shape (IndexList[3]): MMA instruction shape (M, N, K).
Args:
- tile (TileTensor[address_space=AddressSpace.SHARED, linear_idx_type=tile.linear_idx_type, element_size=tile.element_size]): A TileTensor in shared memory sized for the MMA shape.
Returns:
SIMD[tile.dtype, 4]: A SIMD[dtype, 4] vector with transposed data for one lane.
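The tile sizes and row distributions listed above can be sanity-checked with a little arithmetic. The following is a minimal sketch in plain Python, not Mojo, and it does not call `ds_read_tr16_b64_warp`. It rests on two assumptions that this page does not state: that a warp has 64 lanes, and that an "AxB row distribution" means the 16-lane groups are arranged as an A-by-B grid that evenly partitions the tile. The names `WARP_SIZE`, `LAYOUTS`, and `block_shape` are hypothetical and exist only for this illustration.

```python
# Illustrative arithmetic only; this does not model the actual lane-to-element mapping.
WARP_SIZE = 64       # ASSUMPTION: 64-lane warp (not stated on this page)
LANES_PER_ROW = 16   # from the description: "16-lane rows"
ELEMS_PER_LANE = 4   # from the return type: SIMD[tile.dtype, 4]

# mma_shape (M, N, K) -> tile shape and row-group grid, as listed in the description.
# ASSUMPTION: "AxB row distribution" is read as an A-by-B grid of 16-lane groups.
LAYOUTS = {
    (32, 32, 16): {"tile": (8, 32), "grid": (2, 2)},
    (16, 16, 32): {"tile": (16, 16), "grid": (4, 1)},
}


def block_shape(mma_shape):
    """Return the (rows, cols) sub-block one 16-lane group would cover."""
    layout = LAYOUTS[mma_shape]
    rows, cols = layout["tile"]
    grid_rows, grid_cols = layout["grid"]
    # The grid must account for every 16-lane group in the warp exactly once.
    assert grid_rows * grid_cols == WARP_SIZE // LANES_PER_ROW
    # All lanes together must read exactly one tile's worth of elements.
    assert rows * cols == WARP_SIZE * ELEMS_PER_LANE
    return rows // grid_rows, cols // grid_cols


for shape in LAYOUTS:
    print(shape, block_shape(shape))
```

Under these assumptions both shapes work out to the same 4x16 sub-block per 16-lane group (64 elements, or 4 per lane), which agrees with the 4-wide SIMD return value. Whether the hardware assigns elements to lanes in exactly this way is not something this page confirms.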