Mojo function

buffer_load

buffer_load[dtype: DType, width: Int](src_resource: SIMD[uint32, 4], gds_offset: SIMD[int32, 1]) -> SIMD[dtype, width]

Loads data from global memory into a SIMD register.

This function maps directly to the AMDGPU buffer_load instruction, which reads global memory through a buffer resource descriptor and places the result in registers.

Note:

Only supported on AMD GPUs.
Uses non-glc loads by default, so a load can hit the L1 cache and cached data can persist across wavefronts.
Supports only dtype and width combinations whose total load size is 1, 2, 4, 8, or 16 bytes.
Maps directly to llvm.amdgcn.raw.buffer.load intrinsics.
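
The size constraint in the note can be checked with simple arithmetic. The sketch below is a Python model of that rule, not Mojo API. It assumes the load size is the element size multiplied by width. The names DTYPE_SIZES, load_size_bytes, and is_supported are hypothetical and exist only for illustration.

```python
# Illustrative model only: every name here is hypothetical, not part of Mojo.
DTYPE_SIZES = {"int8": 1, "float16": 2, "bfloat16": 2, "float32": 4, "float64": 8}
SUPPORTED_LOAD_BYTES = (1, 2, 4, 8, 16)


def load_size_bytes(dtype: str, width: int) -> int:
    """Total bytes moved by one load of `width` elements of `dtype`."""
    return DTYPE_SIZES[dtype] * width


def is_supported(dtype: str, width: int) -> bool:
    """True when the combination maps to a 1, 2, 4, 8, or 16 byte load."""
    return load_size_bytes(dtype, width) in SUPPORTED_LOAD_BYTES
```

Under this model, float32 with width 4 is a 16-byte load and qualifies, while float32 with width 8 (32 bytes) or width 3 (12 bytes) does not.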

Parameters:

dtype (DType): The data type to load.
width (Int): The SIMD vector width for vectorized loads.

Args:

src_resource (SIMD[uint32, 4]): Buffer resource descriptor created by make_buffer_resource().
gds_offset (SIMD[int32, 1]): Offset in elements (not bytes) from the base address in the resource.
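
Because gds_offset counts elements, the byte position it refers to depends on the size of dtype. The sketch below models that conversion in Python, assuming the byte offset is the element offset times the element size. byte_offset is a hypothetical helper, not a Mojo function.

```python
def byte_offset(element_offset: int, element_size_bytes: int) -> int:
    """Convert an offset in elements to an offset in bytes from the base address."""
    return element_offset * element_size_bytes


# Element 10 of a float32 (4-byte) buffer starts 40 bytes past the base address.
print(byte_offset(10, 4))  # prints 40
```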

Returns:

SIMD[dtype, width]: SIMD vector containing the loaded data.