Mojo module
layout_tensor
Provides the LayoutTensor type for representing multidimensional data.
Aliases
binary_op_type
alias binary_op_type = fn[dtype: DType, width: Int](lhs: SIMD[dtype, width], rhs: SIMD[dtype, width]) -> SIMD[dtype, width]
Type alias for binary operations on SIMD vectors.
This type represents a function that takes two SIMD vectors of the same type and width and returns a SIMD vector of the same type and width.
Args:
- dtype: The data type of the SIMD vector elements.
- width: The width of the SIMD vector.
- lhs: Left-hand side SIMD vector operand.
- rhs: Right-hand side SIMD vector operand.
Returns: A SIMD vector containing the result of the binary operation.
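As an illustration of the signature described above, here is a minimal sketch of a function with the same shape as binary_op_type. The name elementwise_add is hypothetical and is not part of this module. Whether a given Mojo release accepts such a function wherever binary_op_type is expected is an assumption to verify against your toolchain.

```mojo
# Hypothetical example; `elementwise_add` is not defined by layout_tensor.
# It takes two SIMD vectors of the same dtype and width and returns a
# SIMD vector of that same dtype and width, mirroring binary_op_type.
fn elementwise_add[
    dtype: DType, width: Int
](lhs: SIMD[dtype, width], rhs: SIMD[dtype, width]) -> SIMD[dtype, width]:
    # Lane-wise addition of the two operands.
    return lhs + rhs


def main():
    var a = SIMD[DType.float32, 4](1.0, 2.0, 3.0, 4.0)
    var b = SIMD[DType.float32, 4](10.0, 20.0, 30.0, 40.0)
    # dtype and width are inferred from the arguments.
    print(elementwise_add(a, b))
```

Because dtype and width are parameters of the function itself, the same definition serves any element type and vector width.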
Structs
- LayoutTensor: A high-performance tensor with explicit memory layout and hardware-optimized access patterns.
- LayoutTensorIter: Iterator for traversing a memory buffer with a specific layout.
- ThreadScope: Represents the scope of thread operations in GPU programming.
Functions
- copy_dram_to_local: Efficiently copy data from global memory (DRAM) to registers for AMD GPUs.
- copy_dram_to_sram: Synchronously copy data from DRAM (global memory) to SRAM (shared memory) in a GPU context.
- copy_dram_to_sram_async: Asynchronously copy data from DRAM (global memory) to SRAM (shared memory) in a GPU context.
- copy_local_to_dram: Efficiently copy data from registers (LOCAL) to global memory (DRAM).
- copy_local_to_local: Synchronously copy data between local memory (register) tensors with type conversion.
- copy_local_to_shared: Synchronously copy data from local memory (registers) to SRAM (shared memory).
- copy_sram_to_dram: Synchronously copy data from SRAM (shared memory) to DRAM (global memory).
- copy_sram_to_local: Synchronously copy data from SRAM (shared memory) to local memory.
- cp_async_k_major: Asynchronously copy data from DRAM to SRAM using TMA (Tensor Memory Accelerator) with K-major layout.
- cp_async_mn_major: Asynchronously copy data from DRAM to SRAM using TMA (Tensor Memory Accelerator) with MN-major layout.
- stack_allocation_like: Create a stack-allocated tensor with the same layout as an existing tensor.