Mojo function

ds_read_tr8_b64

ds_read_tr8_b64[dtype: DType, //](shared_ptr: UnsafePointer[Scalar[dtype], shared_ptr.origin, address_space=AddressSpace.SHARED]) -> SIMD[dtype, 8]

Reads a 64-bit LDS (Local Data Share) transpose block using the TR8 layout and returns it as a SIMD[dtype, 8] vector of 8-bit elements.

Each 16-lane row reads 16x8 bytes from LDS and performs two interleaved 8x8 byte transposes, producing 8 transposed bytes per lane.
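
The byte counts in the description above can be checked with a small host-side model. The following Python sketch is only an illustration of the shape bookkeeping (16 lanes x 8 bytes = 128 bytes = two 8x8 byte tiles, 8 bytes returned per lane). It is not the hardware's lane mapping: the way the two tiles are split across lanes here is an assumption made for illustration, and the names `transpose_8x8` and `model_row_read` are hypothetical helpers, not part of any Mojo or AMD API.

```python
# Shape-only model of one 16-lane row read. NOT the hardware's exact mapping.
LANES_PER_ROW = 16
BYTES_PER_LANE = 8  # one 64-bit read per lane


def transpose_8x8(tile):
    """Transpose an 8x8 tile given as a list of 8 rows of 8 byte values."""
    return [[tile[r][c] for r in range(8)] for c in range(8)]


def model_row_read(lds_bytes):
    """Model a 16-lane row read as two independent 8x8 byte transposes.

    Assumption (for illustration only): tile 0 is the first 64 bytes and
    tile 1 is the next 64 bytes; lanes 0-7 receive the columns of tile 0 and
    lanes 8-15 receive the columns of tile 1. The real instruction may
    interleave the two tiles differently.
    """
    assert len(lds_bytes) == LANES_PER_ROW * BYTES_PER_LANE  # 128 bytes
    lanes = []
    for t in range(2):
        flat = lds_bytes[t * 64:(t + 1) * 64]
        tile = [flat[r * 8:(r + 1) * 8] for r in range(8)]
        lanes.extend(transpose_8x8(tile))  # one transposed row per lane
    return lanes


# Fill the 128-byte block with its own byte offsets so the result is easy to read.
row = model_row_read(list(range(128)))
```

Under this assumed layout, `row[0]` holds offsets 0, 8, 16, ..., 56 (the first column of tile 0), and every one of the 128 input bytes appears exactly once across the 16 lanes.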

Notes:

  • Only supported on AMD GPUs (CDNA4+).
  • Maps directly to the llvm.amdgcn.ds.read.tr8.b64 intrinsic.
  • The return value must go through a v2i32 intermediate type to avoid a crash in the LLVM type legalizer.

Parameters:

  • dtype (DType): Data type of the elements (must be an 8-bit type).

Args:

  • shared_ptr (UnsafePointer): Pointer to the LDS transpose block, in the shared address space (AddressSpace.SHARED).

Returns:

SIMD: SIMD[dtype, 8] of 8-bit types.
