Mojo function
ds_read_tr16_b64_row
ds_read_tr16_b64_row(tile: TileTensor[address_space=AddressSpace.SHARED, linear_idx_type=tile.linear_idx_type, element_size=tile.element_size]) -> SIMD[tile.dtype, 4]
Performs a 4x16 transposed read from LDS (AMD's local data share, i.e. shared memory) via rocdl.ds.read.tr16.b64.
Each 16-lane "row" loads a 4x16 tile, with per-lane exchange so each lane gets a column of the tile as SIMD[dtype, 4].
Args:
- tile (TileTensor[address_space=AddressSpace.SHARED, linear_idx_type=tile.linear_idx_type, element_size=tile.element_size]): A 4x16 TileTensor in shared memory (2-byte element type).
Returns:
SIMD[tile.dtype, 4]: A SIMD[dtype, 4] vector with one column of the transposed tile.
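The data movement described above can be illustrated with a small host-side model. This is a plain Python sketch, not Mojo device code, and it does not call the intrinsic. It assumes that lane i of a 16-lane row receives column i of the 4x16 tile, top row first; the exact lane-to-column assignment on hardware is not stated on this page. The helper name `model_transposed_read` is hypothetical.

```python
# Host-side model only: plain Python, not Mojo device code.
ROWS, COLS = 4, 16  # tile shape taken from the description above


def model_transposed_read(tile):
    """Return, for each of the 16 lanes, the 4-element column it would hold.

    Assumption: lane i receives column i of the tile, top row first.
    """
    assert len(tile) == ROWS and all(len(row) == COLS for row in tile)
    return [[tile[r][lane] for r in range(ROWS)] for lane in range(COLS)]


# Fill the tile so that element (r, c) has the value r * 16 + c.
tile = [[r * COLS + c for c in range(COLS)] for r in range(ROWS)]
per_lane = model_transposed_read(tile)

print(per_lane[0])  # column 0: [0, 16, 32, 48]
print(per_lane[5])  # column 5: [5, 21, 37, 53]
```

Under this model each lane ends up with four 2-byte elements, i.e. 64 bits, which is consistent with the b64 suffix in the function name.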