Mojo struct
AMDBufferResource
@register_passable(trivial)
struct AMDBufferResource
Fields
- desc (SIMD[DType.uint32, 4]): 128-bit descriptor for a buffer resource on AMD GPUs. Used for buffer_load/buffer_store instructions.
Implemented traits
AnyType, Copyable, ImplicitlyCopyable, Movable, UnknownDestructibility
Aliases
__copyinit__is_trivial
alias __copyinit__is_trivial = SIMD[DType.uint32, 4].__copyinit__is_trivial
__del__is_trivial
alias __del__is_trivial = SIMD[DType.uint32, 4].__del__is_trivial
__moveinit__is_trivial
alias __moveinit__is_trivial = SIMD[DType.uint32, 4].__moveinit__is_trivial
Methods
__init__
__init__[dtype: DType](gds_ptr: UnsafePointer[Scalar[dtype], address_space=address_space, mut=mut, origin=origin], num_records: Int = Int.__init__[UInt32](SIMD[DType.uint32, 1](max_or_inf[DType.uint32]()))) -> Self
__init__() -> Self
get_base_ptr
load
load[dtype: DType, width: Int, *, cache_policy: CacheOperation = CacheOperation(0)](self, vector_offset: Int32, *, scalar_offset: Int32 = 0) -> SIMD[dtype, width]
Returns:
load_to_lds
load_to_lds[dtype: DType, *, width: Int = 1, cache_policy: CacheOperation = CacheOperation(0)](self, vector_offset: Int32, shared_ptr: UnsafePointer[Scalar[dtype], address_space=AddressSpace(3)], *, scalar_offset: Int32 = 0)
Loads data from global memory and stores to shared memory.
Copies data from global memory directly to shared memory (LDS) without staging it in registers.
Parameters:
- dtype (DType): The dtype of the data to be loaded.
- width (Int): The SIMD vector width.
- cache_policy (CacheOperation): Cache operation policy controlling cache behavior at all levels.
Args:
- vector_offset (Int32): Vector memory offset in elements (per thread).
- shared_ptr (UnsafePointer): Shared memory address.
- scalar_offset (Int32): Scalar memory offset in elements (shared across wave).
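The two offsets above are both given in elements, not bytes. The following is a minimal host-side Python sketch of how they might combine, assuming the per-thread and per-wave offsets are simply added and then scaled by the element size. The helper names (`ELEMENT_SIZE`, `effective_element`, `byte_offsets`) are hypothetical and are not part of the Mojo API.

```python
# Hypothetical table of element sizes in bytes; illustrative only.
ELEMENT_SIZE = {"float16": 2, "float32": 4, "float64": 8}


def effective_element(vector_offset, scalar_offset=0):
    """Element index addressed by one thread.

    Assumption: the per-thread (vector) offset and the per-wave (scalar)
    offset are summed to form the final element index.
    """
    return vector_offset + scalar_offset


def byte_offsets(dtype, vector_offset, scalar_offset=0):
    """Convert both element offsets to byte offsets for the given dtype."""
    size = ELEMENT_SIZE[dtype]
    return vector_offset * size, scalar_offset * size


# Example: the wave works on a tile starting at element 64 (scalar offset),
# and lane 2 handles 4 consecutive float32 values starting at element 8.
lane = 2
width = 4
vector_offset = lane * width
scalar_offset = 64

element = effective_element(vector_offset, scalar_offset)
vec_bytes, scalar_bytes = byte_offsets("float32", vector_offset, scalar_offset)
```

In this model, keeping the tile base in scalar_offset lets every thread in the wave share one value, while vector_offset carries only the per-thread part.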
store
store[dtype: DType, width: Int, *, cache_policy: CacheOperation = CacheOperation(0)](self, vector_offset: Int32, val: SIMD[dtype, width], *, scalar_offset: Int32 = 0)
Stores a register variable to global memory with cache operation control.
Writes to global memory from a register with high-level cache control.
Note:
- Only supported on AMD GPUs.
- Provides high-level cache control via CacheOperation enum values.
- Maps directly to llvm.amdgcn.raw.buffer.store intrinsics.
- Cache control bits:
  - SC[1:0] controls coherency scope: 0=wave, 1=group, 2=device, 3=system.
  - nt=True: Use streaming-optimized cache policies (recommended for streaming data).
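As a reading aid for the SC[1:0] encoding listed above, here is a small Python sketch that maps the two-bit value to its scope name. It only restates the table from the note; the names `SC_SCOPES` and `coherency_scope` are hypothetical, and how CacheOperation values translate to these bits is not shown here.

```python
# Scope names indexed by the 2-bit SC value, as listed in the note above.
SC_SCOPES = ("wave", "group", "device", "system")


def coherency_scope(sc_bits):
    """Return the coherency scope name for a 2-bit SC[1:0] value."""
    if not 0 <= sc_bits <= 3:
        raise ValueError("SC[1:0] is a 2-bit field; expected a value in 0..3")
    return SC_SCOPES[sc_bits]


def split_sc(sc_bits):
    """Split SC[1:0] into its individual bits (sc1, sc0)."""
    return (sc_bits >> 1) & 1, sc_bits & 1
```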
Parameters:
- dtype (DType): The data type.
- width (Int): The SIMD vector width.
- cache_policy (CacheOperation): Cache operation policy controlling cache behavior at all levels.
Args: