Buffer
Module
Implements the Buffer class.
Buffer
Defines a Buffer that can be parametrized on a static size and DType.
The Buffer does not own its underlying pointer.
Parameters:
- size (Dim): The static size (if known) of the Buffer.
- type (DType): The element type of the Buffer.
Fields:
data
The pointer to the underlying data.
dtype
The dynamic data type of the buffer.
dynamic_size
The dynamic size of the buffer.
Functions:
__init__
__init__() -> Self
Default initializer for Buffer. By default, all fields are initialized to 0.
Returns:
The Buffer object.
__init__(ptr: Pointer[scalar<#lit.struct.extract<:_"$DType"::_DType type, "value">>]) -> Self
Constructor for a Buffer with statically known size and type.
Constraints:
The size is known.
Args:
- ptr (Pointer[scalar<#lit.struct.extract<:_"$DType"::_DType type, "value">>]): Pointer to the data.
Returns:
The buffer object.
__init__(ptr: DTypePointer[type]) -> Self
Constructor for a Buffer with statically known size and type.
Constraints:
The size is known.
Args:
- ptr (DTypePointer[type]): Pointer to the data.
Returns:
The buffer object.
__init__(ptr: Pointer[scalar<#lit.struct.extract<:_"$DType"::_DType type, "value">>], in_size: Int) -> Self
Constructor for a Buffer with statically known type.
Constraints:
The size is unknown.
Args:
- ptr (Pointer[scalar<#lit.struct.extract<:_"$DType"::_DType type, "value">>]): Pointer to the data.
- in_size (Int): Dynamic size of the buffer.
Returns:
The buffer object.
__init__(ptr: DTypePointer[type], in_size: Int) -> Self
Constructor for a Buffer with statically known type.
Constraints:
The size is unknown.
Args:
- ptr (DTypePointer[type]): Pointer to the data.
- in_size (Int): Dynamic size of the buffer.
Returns:
The buffer object.
__init__(data: DTypePointer[type], dynamic_size: Int, dtype: DType) -> Self
__copyinit__
__copyinit__(existing: Self) -> Self
__getitem__
__getitem__(self: Self, idx: Int) -> SIMD[type, 1]
Loads a single element (SIMD of size 1) from the buffer at the specified index.
Args:
- idx (Int): The index into the Buffer.
Returns:
The value at the idx position.
__setitem__
__setitem__(self: Self, idx: Int, val: scalar<#lit.struct.extract<:_"$DType"::_DType type, "value">>)
Stores a single value into the buffer at the specified index.
Args:
- idx (Int): The index into the Buffer.
- val (scalar<#lit.struct.extract<:_"$DType"::_DType type, "value">>): The value to store.
__setitem__(self: Self, idx: Int, val: SIMD[type, 1])
Stores a single value into the buffer at the specified index.
Args:
- idx (Int): The index into the Buffer.
- val (SIMD[type, 1]): The value to store.
__len__
__len__(self: Self) -> Int
Gets the size of the buffer. If the Buffer size is a known constant, that size is returned; otherwise, the dynamic_size is returned.
Returns:
The size if static otherwise dynamic_size.
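The static-versus-dynamic rule above can be modeled in a few lines of Python. This is only a sketch of the described behavior, not the Mojo API: the `ModelBuffer` class and the use of `None` for an unknown static size are assumptions made for illustration.

```python
from typing import Optional


class ModelBuffer:
    """Hypothetical Python stand-in for Buffer; not the Mojo type."""

    def __init__(self, static_size: Optional[int], dynamic_size: int = 0):
        # static_size=None models an unknown static Dim (assumption of this sketch).
        self.static_size = static_size
        self.dynamic_size = dynamic_size

    def __len__(self) -> int:
        # Prefer the statically known size; fall back to the dynamic one.
        if self.static_size is not None:
            return self.static_size
        return self.dynamic_size


static_len = len(ModelBuffer(static_size=16))
dynamic_len = len(ModelBuffer(static_size=None, dynamic_size=10))
```

In this model, a buffer built with a known size ignores `dynamic_size` entirely, which matches the wording of the description above.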
aligned_simd_load
aligned_simd_load[width: Int, alignment: Int](self: Self, idx: Int) -> SIMD[type, width]
Loads a simd value from the buffer at the specified index.
Parameters:
- width (Int): The simd_width of the load.
- alignment (Int): The alignment value.
Args:
- idx (Int): The index into the Buffer.
Returns:
The simd value starting at the idx position and ending at idx+width.
aligned_simd_store
aligned_simd_store[width: Int, alignment: Int](self: Self, idx: Int, val: SIMD[type, width])
Stores a simd value into the buffer at the specified index.
Parameters:
- width (Int): The width of the simd vector.
- alignment (Int): The alignment value.
Args:
- idx (Int): The index into the Buffer.
- val (SIMD[type, width]): The value to store.
aligned_stack_allocation
aligned_stack_allocation[alignment: Int]() -> Self
Constructs a buffer instance backed by stack allocated memory space.
Parameters:
- alignment (Int): Address alignment requirement for the allocation.
Returns:
Constructed buffer with the allocated space.
bytecount
bytecount(self: Self) -> Int
Return the size of the Buffer in bytes.
Returns:
The size of the Buffer in bytes.
fill
fill(self: Self, val: SIMD[type, 1])
Assigns val to all elements in the Buffer.
The fill is performed in chunks of size N, where N is the native SIMD width of type on the system.
Args:
- val (SIMD[type, 1]): The value to store.
prefetch
prefetch[params: PrefetchOptions](self: Self, idx: Int)
Prefetch the data at the given index.
Parameters:
- params (PrefetchOptions): The prefetch configuration.
Args:
- idx (Int): The index of the prefetched location.
simd_fill
simd_fill[simd_width: Int](self: Self, val: SIMD[type, 1])
Assigns val to all elements in chunks of size simd_width.
Parameters:
- simd_width (Int): The simd_width of the fill.
Args:
- val (SIMD[type, 1]): The value to store.
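A chunked fill of this kind can be sketched in Python as follows. The function name `simd_fill_model` is hypothetical, and the handling of a remainder that does not fill a whole chunk (written one element at a time here) is an assumption, since the description above does not say how the tail is treated.

```python
def simd_fill_model(data: list, val, simd_width: int) -> None:
    """Fill `data` in place, simd_width elements at a time (model only)."""
    n = len(data)
    # Index where the last full chunk ends.
    vector_end = (n // simd_width) * simd_width
    for start in range(0, vector_end, simd_width):
        # One "vector store" of simd_width copies of val.
        data[start:start + simd_width] = [val] * simd_width
    # Assumption: leftover elements are written individually.
    for i in range(vector_end, n):
        data[i] = val


buf = [0.0] * 10
simd_fill_model(buf, 1.5, simd_width=4)
```

With 10 elements and a width of 4, the model performs two chunked writes and two scalar writes.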
simd_load
simd_load[width: Int](self: Self, idx: Int) -> SIMD[type, width]
Loads a simd value from the buffer at the specified index.
Parameters:
- width (Int): The simd_width of the load.
Args:
- idx (Int): The index into the Buffer.
Returns:
The simd value starting at the idx position and ending at idx+width.
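To make the range concrete, the following Python sketch models the load as a half-open slice, so the lanes come from positions idx through idx+width-1. Reading "ending at idx+width" as an exclusive bound is an interpretation, and the name `simd_load_model` and its bounds check are assumptions of the sketch rather than documented behavior.

```python
def simd_load_model(data: list, idx: int, width: int) -> list:
    """Return the `width` elements starting at `idx` (model only)."""
    # Assumption: the sketch rejects out-of-range loads; the documented
    # method does not state whether it checks bounds.
    if idx < 0 or idx + width > len(data):
        raise IndexError("load would read outside the buffer")
    return data[idx:idx + width]


lanes = simd_load_model(list(range(8)), idx=2, width=4)
```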
simd_nt_store
simd_nt_store[width: Int](self: Self, idx: Int, val: SIMD[type, width])
Stores a simd value using non-temporal store.
Constraints:
The address must be properly aligned, 64B for avx512, 32B for avx2, and 16B for avx.
Parameters:
- width (Int): The width of the simd vector.
Args:
- idx (Int): The index into the Buffer.
- val (SIMD[type, width]): The value to store.
simd_store
simd_store[width: Int](self: Self, idx: Int, val: SIMD[type, width])
Stores a simd value into the buffer at the specified index.
Parameters:
- width (Int): The width of the simd vector.
Args:
- idx (Int): The index into the Buffer.
- val (SIMD[type, width]): The value to store.
stack_allocation
stack_allocation() -> Self
Constructs a buffer instance backed by stack allocated memory space.
Returns:
Constructed buffer with the allocated space.
zero
zero(self: Self)
Set all bytes of the Buffer to 0.
DynamicRankBuffer
DynamicRankBuffer represents a buffer with unknown rank, shapes and dtype.
It is not as efficient as the statically ranked buffer, but is useful when interacting with external functions. In particular, the shape is represented as a fixed-size (i.e., _MAX_RANK) array of dimensions to simplify the ABI.
Fields:
data
The pointer to the buffer.
rank
The buffer rank. Has a max value of _MAX_RANK.
shape
The dynamic shape of the buffer.
type
The dynamic dtype of the buffer.
Functions:
__init__
__init__(data: DTypePointer[invalid], rank: Int, shape: StaticIntTuple[8], type: DType) -> Self
Construct DynamicRankBuffer.
Args:
- data (DTypePointer[invalid]): Pointer to the underlying data.
- rank (Int): Rank of the buffer.
- shape (StaticIntTuple[8]): Shapes of the buffer.
- type (DType): dtype of the buffer.
Returns:
Constructed DynamicRankBuffer.
dim
dim(self: Self, idx: Int) -> Int
Get given dimension.
Args:
- idx (Int): The dimension index.
Returns:
The buffer size on the given dimension.
get_shape
get_shape[rank: Int](self: Self) -> StaticIntTuple[rank]
Get a static tuple representing the buffer shape.
Parameters:
- rank (Int): Rank of the buffer.
Returns:
A static tuple of size ‘rank’ filled with buffer shapes.
num_elements
num_elements(self: Self) -> Int
Get number of elements in the buffer.
Returns:
The number of elements in the buffer.
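Because the shape is stored in a fixed-size array, only the first `rank` entries are meaningful. The Python sketch below models the element count under that reading. The constant `MAX_RANK = 8` is inferred from the `StaticIntTuple[8]` in the constructor signature, and the function name and the zero padding of unused slots are assumptions.

```python
MAX_RANK = 8  # inferred from StaticIntTuple[8] in the constructor signature


def num_elements_model(shape: tuple, rank: int) -> int:
    """Multiply the first `rank` dimensions of a fixed-size shape (model only)."""
    if len(shape) != MAX_RANK:
        raise ValueError("shape must have exactly MAX_RANK slots")
    if not 0 < rank <= MAX_RANK:
        raise ValueError("rank must be in 1..MAX_RANK")
    count = 1
    for dim in shape[:rank]:
        count *= dim
    return count


# A rank-3 shape (2, 3, 4) padded out to MAX_RANK slots.
padded_shape = (2, 3, 4, 0, 0, 0, 0, 0)
total = num_elements_model(padded_shape, rank=3)
```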
rank_dispatch
rank_dispatch[func: fn[Int]() capturing -> None](self: Self)
Dispatch the function call based on buffer rank.
Constraints:
Rank must be positive and less than or equal to 8.
Parameters:
- func (fn[Int]() capturing -> None): Function to dispatch. The function should be parametrized on an index parameter, which is used as the rank when the function is called.
rank_dispatch[func: fn[Int]() capturing -> None](self: Self, out_chain: OutputChainPtr)
Dispatch the function call based on buffer rank.
Constraints:
Rank must be positive and less than or equal to 8.
Parameters:
- func (fn[Int]() capturing -> None): Function to dispatch. The function should be parametrized on an index parameter, which is used as the rank when the function is called.
Args:
- out_chain (OutputChainPtr): The output chain.
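The dispatch pattern described above, turning a runtime rank into a call specialized on that rank, can be modeled in Python as shown below. This is a sketch of the idea only: `rank_dispatch_model` is a hypothetical name, a plain function argument stands in for the compile-time parameter, and the loop over candidate ranks is an assumption about how such a dispatch could be structured.

```python
MAX_RANK = 8  # upper bound stated in the constraint above


def rank_dispatch_model(rank: int, func) -> None:
    """Call func(r) for the candidate r that equals the runtime rank (model only)."""
    if not 0 < rank <= MAX_RANK:
        raise ValueError("rank must be positive and at most MAX_RANK")
    # Each candidate rank stands for one specialization of func.
    for static_rank in range(1, MAX_RANK + 1):
        if rank == static_rank:
            func(static_rank)
            return


seen = []
rank_dispatch_model(3, seen.append)
```

Exactly one specialization runs per call, so `seen` ends up holding the single rank that matched.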
to_buffer
to_buffer[type: DType](self: Self) -> Buffer[#pop.variant<:i1 0>, type]
Cast DynamicRankBuffer to Buffer.
Parameters:
- type (DType): dtype of the buffer.
Returns:
Constructed Buffer.
to_ndbuffer
to_ndbuffer[rank: Int, type: DType](self: Self) -> NDBuffer[rank, create_unknown[rank](), type]
Cast the buffer to NDBuffer.
Constraints:
Rank of DynamicRankBuffer must equal rank of NDBuffer.
Parameters:
- rank (Int): Rank of the buffer.
- type (DType): dtype of the buffer.
Returns:
Constructed NDBuffer.
to_ndbuffer[rank: Int, type: DType](self: Self, stride: StaticIntTuple[rank]) -> NDBuffer[rank, create_unknown[rank](), type]
Cast the buffer to NDBuffer.
Constraints:
Rank of DynamicRankBuffer must equal rank of NDBuffer.
Parameters:
- rank (Int): Rank of the buffer.
- type (DType): dtype of the buffer.
Args:
- stride (StaticIntTuple[rank]): Strides of the buffer.
Returns:
Constructed NDBuffer.
NDBuffer
An N-dimensional Buffer.
NDBuffer can be parametrized on rank, static dimensions and DType. It does not own its underlying pointer.
Parameters:
- rank (Int): The rank of the buffer.
- shape (DimList): The static size (if known) of the buffer.
- type (DType): The element type of the buffer.
Fields:
data
The underlying data for the buffer. The pointer is not owned by the NDBuffer.
dynamic_dtype
The dynamic dtype.
dynamic_shape
The dynamic value of the shape.
dynamic_stride
The dynamic stride of the buffer.
is_contiguous
True if the contents of the buffer are contiguous in memory.
Functions:
__init__
__init__() -> Self
Default initializer for NDBuffer. By default, all fields are initialized to 0.
Returns:
The NDBuffer object.
__init__(ptr: Pointer[scalar<#lit.struct.extract<:_"$DType"::_DType type, "value">>]) -> Self
Constructor for NDBuffer with statically known rank, shapes and type.
Constraints:
The rank, shapes, and type are known.
Args:
- ptr (Pointer[scalar<#lit.struct.extract<:_"$DType"::_DType type, "value">>]): Pointer to the data.
Returns:
The NDBuffer object.
__init__(ptr: DTypePointer[type]) -> Self
Constructor for NDBuffer with statically known rank, shapes and type.
Constraints:
The rank, shapes, and type are known.
Args:
- ptr (DTypePointer[type]): Pointer to the data.
Returns:
The NDBuffer object.
__init__(ptr: pointer<scalar<#lit.struct.extract<:_"$DType"::_DType type, "value">>>, dynamic_shape: StaticIntTuple[rank], dynamic_dtype: DType) -> Self
Constructor for NDBuffer with statically known rank, but dynamic shapes and type.
Constraints:
The rank is known.
Args:
- ptr (pointer<scalar<#lit.struct.extract<:_"$DType"::_DType type, "value">>>): Pointer to the data.
- dynamic_shape (StaticIntTuple[rank]): A static tuple of size ‘rank’ representing shapes.
- dynamic_dtype (DType): DType for the buffer.
Returns:
The NDBuffer object.
__init__(ptr: Pointer[scalar<#lit.struct.extract<:_"$DType"::_DType type, "value">>], dynamic_shape: StaticIntTuple[rank], dynamic_dtype: DType) -> Self
Constructor for NDBuffer with statically known rank, but dynamic shapes and type.
Constraints:
The rank is known.
Args:
- ptr (Pointer[scalar<#lit.struct.extract<:_"$DType"::_DType type, "value">>]): Pointer to the data.
- dynamic_shape (StaticIntTuple[rank]): A static tuple of size ‘rank’ representing shapes.
- dynamic_dtype (DType): DType for the buffer.
Returns:
The NDBuffer object.
__init__(ptr: DTypePointer[type], dynamic_shape: StaticIntTuple[rank], dynamic_dtype: DType) -> Self
Constructor for NDBuffer with statically known rank, but dynamic shapes and type.
Constraints:
The rank is known.
Args:
- ptr (DTypePointer[type]): Pointer to the data.
- dynamic_shape (StaticIntTuple[rank]): A static tuple of size ‘rank’ representing shapes.
- dynamic_dtype (DType): DType for the buffer.
Returns:
The NDBuffer object.
__init__(ptr: Pointer[scalar<#lit.struct.extract<:_"$DType"::_DType type, "value">>], dynamic_shape: StaticIntTuple[rank], dynamic_dtype: DType, dynamic_stride: StaticIntTuple[rank]) -> Self
Constructor for strided NDBuffer with statically known rank, but dynamic shapes and type.
Constraints:
The rank is known.
Args:
- ptr (Pointer[scalar<#lit.struct.extract<:_"$DType"::_DType type, "value">>]): Pointer to the data.
- dynamic_shape (StaticIntTuple[rank]): A static tuple of size ‘rank’ representing shapes.
- dynamic_dtype (DType): DType for the buffer.
- dynamic_stride (StaticIntTuple[rank]): A static tuple of size ‘rank’ representing strides.
Returns:
The NDBuffer object.
__init__(ptr: DTypePointer[type], dynamic_shape: StaticIntTuple[rank], dynamic_dtype: DType, dynamic_stride: StaticIntTuple[rank]) -> Self
Constructor for strided NDBuffer with statically known rank, but dynamic shapes and type.
Constraints:
The rank is known.
Args:
- ptr (DTypePointer[type]): Pointer to the data.
- dynamic_shape (StaticIntTuple[rank]): A static tuple of size ‘rank’ representing shapes.
- dynamic_dtype (DType): DType for the buffer.
- dynamic_stride (StaticIntTuple[rank]): A static tuple of size ‘rank’ representing strides.
Returns:
The NDBuffer object.
__init__(data: DTypePointer[type], _rank: Int, dynamic_shape: StaticIntTuple[rank], dynamic_dtype: DType, dynamic_stride: StaticIntTuple[rank], is_contiguous: Bool) -> Self
__getitem__
__getitem__(self: Self, *idx: Int) -> SIMD[type, 1]
Get an element from the buffer at the specified index.
Args:
- idx (*Int): Index of the element to retrieve.
Returns:
The value of the element.
__getitem__(self: Self, idx: StaticIntTuple[rank]) -> SIMD[type, 1]
Get an element from the buffer at the specified index.
Args:
- idx (StaticIntTuple[rank]): Index of the element to retrieve.
Returns:
The value of the element.
__setitem__
__setitem__(self: Self, idx: StaticIntTuple[rank], val: SIMD[type, 1])
Stores a single value into the buffer at the specified index.
Args:
- idx (StaticIntTuple[rank]): The index into the Buffer.
- val (SIMD[type, 1]): The value to store.
__len__
__len__(self: Self) -> Int
Computes the NDBuffer’s number of elements.
Returns:
The total number of elements in the NDBuffer.
aligned_simd_load
aligned_simd_load[width: Int, alignment: Int](self: Self, *idx: Int) -> SIMD[type, width]
Loads a simd value from the buffer at the specified index.
Constraints:
The buffer must be contiguous or width must be 1.
Parameters:
- width (Int): The simd_width of the load.
- alignment (Int): The alignment value.
Args:
- idx (*Int): The index into the NDBuffer.
Returns:
The simd value starting at the idx position and ending at idx+width.
aligned_simd_load[width: Int, alignment: Int](self: Self, idx: VariadicList[Int]) -> SIMD[type, width]
Loads a simd value from the buffer at the specified index.
Constraints:
The buffer must be contiguous or width must be 1.
Parameters:
- width (Int): The simd_width of the load.
- alignment (Int): The alignment value.
Args:
- idx (VariadicList[Int]): The index into the NDBuffer.
Returns:
The simd value starting at the idx position and ending at idx+width.
aligned_simd_load[width: Int, alignment: Int](self: Self, idx: StaticIntTuple[rank]) -> SIMD[type, width]
Loads a simd value from the buffer at the specified index.
Constraints:
The buffer must be contiguous or width must be 1.
Parameters:
- width (Int): The simd_width of the load.
- alignment (Int): The alignment value.
Args:
- idx (StaticIntTuple[rank]): The index into the NDBuffer.
Returns:
The simd value starting at the idx position and ending at idx+width.
aligned_simd_load[width: Int, alignment: Int](self: Self, idx: StaticTuple[rank, Int]) -> SIMD[type, width]
Loads a simd value from the buffer at the specified index.
Constraints:
The buffer must be contiguous or width must be 1.
Parameters:
- width (Int): The simd_width of the load.
- alignment (Int): The alignment value.
Args:
- idx (StaticTuple[rank, Int]): The index into the NDBuffer.
Returns:
The simd value starting at the idx position and ending at idx+width.
aligned_simd_store
aligned_simd_store[width: Int, alignment: Int](self: Self, idx: StaticIntTuple[rank], val: SIMD[type, width])
Stores a simd value into the buffer at the specified index.
Constraints:
The buffer must be contiguous or width must be 1.
Parameters:
- width (Int): The width of the simd vector.
- alignment (Int): The alignment value.
Args:
- idx (StaticIntTuple[rank]): The index into the Buffer.
- val (SIMD[type, width]): The value to store.
aligned_simd_store[width: Int, alignment: Int](self: Self, idx: StaticTuple[rank, Int], val: SIMD[type, width])
Stores a simd value into the buffer at the specified index.
Constraints:
The buffer must be contiguous or width must be 1.
Parameters:
- width (Int): The width of the simd vector.
- alignment (Int): The alignment value.
Args:
- idx (StaticTuple[rank, Int]): The index into the Buffer.
- val (SIMD[type, width]): The value to store.
aligned_stack_allocation
aligned_stack_allocation[alignment: Int]() -> Self
Constructs an NDBuffer instance backed by stack allocated memory space.
Parameters:
- alignment (Int): Address alignment requirement for the allocation.
Returns:
Constructed NDBuffer with the allocated space.
bytecount
bytecount(self: Self) -> Int
Return the size of the NDBuffer in bytes.
Returns:
The size of the NDBuffer in bytes.
dim
dim[index: Int](self: Self) -> Int
Get the buffer dimension at the given index.
Parameters:
- index (Int): The index of the dimension to get.
Returns:
The buffer size at the given dimension.
dim(self: Self, index: Int) -> Int
Get the buffer dimension at the given index.
Args:
- index (Int): The index of the dimension to get.
Returns:
The buffer size at the given dimension.
fill
fill(self: Self, val: SIMD[type, 1])
Assigns val to all elements in the Buffer.
The fill is performed in chunks of size N, where N is the native SIMD width of type on the system.
Args:
- val (SIMD[type, 1]): The value to store.
flatten
flatten(self: Self) -> Buffer[#pop.variant<:i1 0>, type]
Construct a flattened Buffer counterpart for this NDBuffer.
Constraints:
The buffer must be contiguous.
Returns:
Constructed Buffer object.
get_nd_index
get_nd_index(self: Self, idx: Int) -> StaticIntTuple[rank]
Computes the NDBuffer’s ND-index based on the flat index.
Args:
- idx (Int): The flat index.
Returns:
The index positions.
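Converting a flat index into an N-D index is a repeated divide-and-remainder over the shape. The Python sketch below shows one way to do it, assuming a row-major (last dimension fastest) layout. That layout, along with the name `get_nd_index_model`, is an assumption, since the description above does not state the ordering.

```python
def get_nd_index_model(flat_idx: int, shape: tuple) -> tuple:
    """Convert a flat index to an N-D index, assuming row-major order (model only)."""
    index = [0] * len(shape)
    remaining = flat_idx
    # Peel off the fastest-varying (last) dimension first.
    for axis in range(len(shape) - 1, -1, -1):
        index[axis] = remaining % shape[axis]
        remaining //= shape[axis]
    return tuple(index)


nd = get_nd_index_model(17, (2, 3, 4))
```

For a shape of (2, 3, 4), flat index 17 decomposes as 1*12 + 1*4 + 1, giving (1, 1, 1).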
get_rank
get_rank(self: Self) -> Int
Returns the rank of the buffer.
Returns:
The rank of NDBuffer.
get_shape
get_shape(self: Self) -> StaticIntTuple[rank]
Returns the shapes of the buffer.
Returns:
A static tuple of size ‘rank’ representing shapes of the NDBuffer.
num_elements
num_elements(self: Self) -> Int
Computes the NDBuffer’s number of elements.
Returns:
The total number of elements in the NDBuffer.
prefetch
prefetch[params: PrefetchOptions](self: Self, *idx: Int)
Prefetch the data at the given index.
Parameters:
- params (PrefetchOptions): The prefetch configuration.
Args:
- idx (*Int): The N-D index of the prefetched location.
simd_fill
simd_fill[simd_width: Int](self: Self, val: SIMD[type, 1])
Assigns val to all elements in chunks of size simd_width.
Parameters:
- simd_width (Int): The simd_width of the fill.
Args:
- val (SIMD[type, 1]): The value to store.
simd_load
simd_load[width: Int](self: Self, *idx: Int) -> SIMD[type, width]
Loads a simd value from the buffer at the specified index.
Constraints:
The buffer must be contiguous or width must be 1.
Parameters:
- width (Int): The simd_width of the load.
Args:
- idx (*Int): The index into the NDBuffer.
Returns:
The simd value starting at the idx position and ending at idx+width.
simd_load[width: Int](self: Self, idx: VariadicList[Int]) -> SIMD[type, width]
Loads a simd value from the buffer at the specified index.
Constraints:
The buffer must be contiguous or width must be 1.
Parameters:
- width (Int): The simd_width of the load.
Args:
- idx (VariadicList[Int]): The index into the NDBuffer.
Returns:
The simd value starting at the idx position and ending at idx+width.
simd_load[width: Int](self: Self, idx: StaticIntTuple[rank]) -> SIMD[type, width]
Loads a simd value from the buffer at the specified index.
Constraints:
The buffer must be contiguous or width must be 1.
Parameters:
- width (Int): The simd_width of the load.
Args:
- idx (StaticIntTuple[rank]): The index into the NDBuffer.
Returns:
The simd value starting at the idx position and ending at idx+width.
simd_load[width: Int](self: Self, idx: StaticTuple[rank, Int]) -> SIMD[type, width]
Loads a simd value from the buffer at the specified index.
Constraints:
The buffer must be contiguous or width must be 1.
Parameters:
- width (Int): The simd_width of the load.
Args:
- idx (StaticTuple[rank, Int]): The index into the NDBuffer.
Returns:
The simd value starting at the idx position and ending at idx+width.
simd_nt_store
simd_nt_store[width: Int](self: Self, idx: StaticIntTuple[rank], val: SIMD[type, width])
Stores a simd value using non-temporal store.
Constraints:
The buffer must be contiguous. The address must be properly aligned, 64B for avx512, 32B for avx2, and 16B for avx.
Parameters:
- width (Int): The width of the simd vector.
Args:
- idx (StaticIntTuple[rank]): The index into the Buffer.
- val (SIMD[type, width]): The value to store.
simd_nt_store[width: Int](self: Self, idx: StaticTuple[rank, Int], val: SIMD[type, width])
Stores a simd value using non-temporal store.
Constraints:
The buffer must be contiguous. The address must be properly aligned, 64B for avx512, 32B for avx2, and 16B for avx.
Parameters:
- width (Int): The width of the simd vector.
Args:
- idx (StaticTuple[rank, Int]): The index into the Buffer.
- val (SIMD[type, width]): The value to store.
simd_store
simd_store[width: Int](self: Self, idx: StaticIntTuple[rank], val: SIMD[type, width])
Stores a simd value into the buffer at the specified index.
Constraints:
The buffer must be contiguous or width must be 1.
Parameters:
- width (Int): The width of the simd vector.
Args:
- idx (StaticIntTuple[rank]): The index into the Buffer.
- val (SIMD[type, width]): The value to store.
simd_store[width: Int](self: Self, idx: StaticTuple[rank, Int], val: SIMD[type, width])
Stores a simd value into the buffer at the specified index.
Constraints:
The buffer must be contiguous or width must be 1.
Parameters:
- width (Int): The width of the simd vector.
Args:
- idx (StaticTuple[rank, Int]): The index into the Buffer.
- val (SIMD[type, width]): The value to store.
size
size(self: Self) -> Int
Computes the NDBuffer’s number of elements.
Returns:
The total number of elements in the NDBuffer.
stack_allocation
stack_allocation() -> Self
Constructs an NDBuffer instance backed by stack allocated memory space.
Returns:
Constructed NDBuffer with the allocated space.
stride
stride(self: Self, index: Int) -> Int
Get the buffer stride at the given index.
Args:
- index (Int): The index of the dimension to get the stride for.
Returns:
The stride at the given dimension.
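For a contiguous buffer, the stride of each dimension follows from the shape. The Python sketch below derives row-major strides and uses them to locate an element. It assumes strides are counted in elements rather than bytes and that the layout is row-major. Both are assumptions, and the helper names are hypothetical.

```python
def contiguous_strides(shape: tuple) -> tuple:
    """Row-major strides, in elements, for a contiguous buffer (model only)."""
    strides = [1] * len(shape)
    # Each stride is the product of all dimensions to its right.
    for axis in range(len(shape) - 2, -1, -1):
        strides[axis] = strides[axis + 1] * shape[axis + 1]
    return tuple(strides)


def element_offset(index: tuple, strides: tuple) -> int:
    """Flat element offset of an N-D index under the given strides."""
    return sum(i * s for i, s in zip(index, strides))


strides = contiguous_strides((2, 3, 4))
offset = element_offset((1, 2, 3), strides)
```

Here the last dimension has stride 1, the middle one 4, and the first 12, so index (1, 2, 3) lands at offset 12 + 8 + 3.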
zero
zero(self: Self)
Set all bytes of the NDBuffer to 0.
Constraints:
The buffer must be contiguous.
partial_simd_load
partial_simd_load[width: Int, type: DType](storage: DTypePointer[type], lbound: Int, rbound: Int, pad_value: SIMD[type, 1]) -> SIMD[type, width]
Loads a vector with dynamic bound.
Out-of-bound lanes are filled with pad_value. A lane is valid if lbound <= idx < rbound, for idx from 0 to width-1.
e.g. addr 0  1  2  3
     data x 42 43  x
partial_simd_load[4](addr0, 1, 3, 0) #gives [0 42 43 0]
Parameters:
- width (Int): The system simd vector size.
- type (DType): The underlying dtype of computation.
Args:
- storage (DTypePointer[type]): Pointer to the address to load from.
- lbound (Int): Lower bound of valid index within simd (inclusive).
- rbound (Int): Upper bound of valid index within simd (non-inclusive).
- pad_value (SIMD[type, 1]): Value to fill for out-of-bound indices.
Returns:
The loaded SIMD vector, with out-of-bound lanes filled with pad_value.
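The masking rule can be checked with a small Python model. The function name `partial_simd_load_model` is hypothetical, and a Python list stands in for the memory behind `storage`; the sketch only mirrors the lane-validity rule stated above.

```python
def partial_simd_load_model(storage: list, lbound: int, rbound: int,
                            pad_value, width: int) -> list:
    """Load `width` lanes; lanes outside [lbound, rbound) get pad_value (model only)."""
    return [
        storage[lane] if lbound <= lane < rbound else pad_value
        for lane in range(width)
    ]


# 99 marks memory that must not influence the result.
loaded = partial_simd_load_model([99, 42, 43, 99], lbound=1, rbound=3,
                                 pad_value=0, width=4)
```

Lanes 0 and 3 fall outside the valid range, so they take the pad value instead of the stored 99.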
partial_simd_store
partial_simd_store[width: Int, type: DType](storage: DTypePointer[type], lbound: Int, rbound: Int, data: SIMD[type, width])
Stores a vector with dynamic bound.
Out-of-bound data is ignored. A lane is valid if lbound <= idx < rbound, for idx from 0 to width-1.
e.g. addr 0 1 2 3
     data 0 0 0 0
partial_simd_store[4](addr0, 1, 3, [-1, 42, 43, -1]) #gives [0 42 43 0]
Parameters:
- width (Int): The system simd vector size.
- type (DType): The underlying dtype of computation.
Args:
- storage (DTypePointer[type]): Pointer to the address to store to.
- lbound (Int): Lower bound of valid index within simd (inclusive).
- rbound (Int): Upper bound of valid index within simd (non-inclusive).
- data (SIMD[type, width]): The vector value to store.
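The store counterpart can be modeled the same way in Python: only lanes inside the valid range are written, and the rest of memory is left untouched. The name `partial_simd_store_model` is hypothetical, and a Python list stands in for the memory behind `storage`.

```python
def partial_simd_store_model(storage: list, lbound: int, rbound: int,
                             data: list) -> None:
    """Write data[lane] to storage[lane] only where lbound <= lane < rbound (model only)."""
    for lane, value in enumerate(data):
        if lbound <= lane < rbound:
            storage[lane] = value


memory = [0, 0, 0, 0]
partial_simd_store_model(memory, lbound=1, rbound=3, data=[-1, 42, 43, -1])
```

The -1 values in lanes 0 and 3 are never written, so the zeros already in memory survive.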
prod_dims
prod_dims[start_dim: Int, end_dim: Int, rank: Int, shape: DimList, type: DType](x: NDBuffer[rank, shape, type]) -> Int
Compute the product of a slice of the given buffer’s dimensions.
Parameters:
- start_dim (Int): The index at which to begin computing the product.
- end_dim (Int): The index at which to stop computing the product.
- rank (Int): The rank of the NDBuffer.
- shape (DimList): The shape of the NDBuffer.
- type (DType): The element-type of the NDBuffer.
Args:
- x (NDBuffer[rank, shape, type]): The NDBuffer whose dimensions will be multiplied.
Returns:
The product of the specified slice of the buffer’s dimensions.
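As an illustration, the product over a slice of dimensions can be modeled in Python as below. The sketch treats the range as half-open, including start_dim and excluding end_dim; the description above does not say whether end_dim is inclusive, so that choice is an assumption, as is the name `prod_dims_model`.

```python
def prod_dims_model(shape: tuple, start_dim: int, end_dim: int) -> int:
    """Product of shape[start_dim:end_dim]; half-open range is an assumption."""
    product = 1
    for dim in shape[start_dim:end_dim]:
        product *= dim
    return product


inner = prod_dims_model((2, 3, 4, 5), start_dim=1, end_dim=3)
```

Under the half-open reading, dimensions 1 and 2 of the shape (2, 3, 4, 5) are multiplied, which is 3 * 4.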