Buffer

Module

Implements the Buffer class.

Buffer

Defines a Buffer which can be parametrized on a static size and DType.

The Buffer does not own its underlying pointer.

Parameters:

  • size (Dim): The static size (if known) of the Buffer.
  • type (DType): The element type of the Buffer.

Fields:

data

The underlying data pointer of the buffer.

dtype

The dynamic data type of the buffer.

dynamic_size

The dynamic size of the buffer.

Functions:

__init__

__init__() -> Self

Default initializer for Buffer. By default the fields are all initialized to 0.

Returns:

The buffer object.

__init__(ptr: Pointer[scalar<#lit.struct.extract<:_"$DType"::_DType type, "value">>]) -> Self

Constructor for a Buffer with statically known size and type.

Constraints:

The size is known.

Args:

  • ptr (Pointer[scalar<#lit.struct.extract<:_"$DType"::_DType type, "value">>]): Pointer to the data.

Returns:

The buffer object.

__init__(ptr: DTypePointer[type]) -> Self

Constructor for a Buffer with statically known size and type.

Constraints:

The size is known.

Args:

  • ptr (DTypePointer[type]): Pointer to the data.

Returns:

The buffer object.

__init__(ptr: Pointer[scalar<#lit.struct.extract<:_"$DType"::_DType type, "value">>], in_size: Int) -> Self

Constructor for a Buffer with statically known type.

Constraints:

The size is unknown.

Args:

  • ptr (Pointer[scalar<#lit.struct.extract<:_"$DType"::_DType type, "value">>]): Pointer to the data.
  • in_size (Int): Dynamic size of the buffer.

Returns:

The buffer object.

__init__(ptr: DTypePointer[type], in_size: Int) -> Self

Constructor for a Buffer with statically known type.

Constraints:

The size is unknown.

Args:

  • ptr (DTypePointer[type]): Pointer to the data.
  • in_size (Int): Dynamic size of the buffer.

Returns:

The buffer object.

__init__(data: DTypePointer[type], dynamic_size: Int, dtype: DType) -> Self

__copyinit__

__copyinit__(existing: Self) -> Self

__getitem__

__getitem__(self: Self, idx: Int) -> SIMD[type, 1]

Loads a single element (SIMD of size 1) from the buffer at the specified index.

Args:

  • idx (Int): The index into the Buffer.

Returns:

The value at the idx position.

__setitem__

__setitem__(self: Self, idx: Int, val: scalar<#lit.struct.extract<:_"$DType"::_DType type, "value">>)

Stores a single value into the buffer at the specified index.

Args:

  • idx (Int): The index into the Buffer.
  • val (scalar<#lit.struct.extract<:_"$DType"::_DType type, "value">>): The value to store.

__setitem__(self: Self, idx: Int, val: SIMD[type, 1])

Stores a single value into the buffer at the specified index.

Args:

  • idx (Int): The index into the Buffer.
  • val (SIMD[type, 1]): The value to store.

__len__

__len__(self: Self) -> Int

Gets the static size if it is a known constant; otherwise gets the dynamic_size.

If the size parameter of the Buffer is a known constant, that value is returned. Otherwise, the dynamic_size field is returned.

Returns:

The static size if it is known, otherwise dynamic_size.
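
The selection rule above can be modeled in a few lines. This is a Python sketch of the semantics only, not the Mojo implementation; the function name buffer_len and the use of None to stand in for an unknown static Dim are assumptions made for illustration.

```python
from typing import Optional


def buffer_len(static_size: Optional[int], dynamic_size: int) -> int:
    """Return the static size when it is known, else the dynamic size."""
    # None models an unknown static Dim (an assumption of this sketch).
    if static_size is not None:
        return static_size
    return dynamic_size


print(buffer_len(16, 0))     # static size is known, so it is used
print(buffer_len(None, 10))  # static size unknown, falls back to dynamic_size
```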

aligned_simd_load

aligned_simd_load[width: Int, alignment: Int](self: Self, idx: Int) -> SIMD[type, width]

Loads a simd value from the buffer at the specified index.

Parameters:

  • width (Int): The simd_width of the load.
  • alignment (Int): The alignment value.

Args:

  • idx (Int): The index into the Buffer.

Returns:

The simd value starting at the idx position and ending at idx+width.

aligned_simd_store

aligned_simd_store[width: Int, alignment: Int](self: Self, idx: Int, val: SIMD[type, width])

Stores a simd value into the buffer at the specified index.

Parameters:

  • width (Int): The width of the simd vector.
  • alignment (Int): The alignment value.

Args:

  • idx (Int): The index into the Buffer.
  • val (SIMD[type, width]): The value to store.

aligned_stack_allocation

aligned_stack_allocation[alignment: Int]() -> Self

Constructs a buffer instance backed by stack allocated memory space.

Parameters:

  • alignment (Int): Address alignment requirement for the allocation.

Returns:

Constructed buffer with the allocated space.

bytecount

bytecount(self: Self) -> Int

Return the size of the Buffer in bytes.

Returns:

The size of the Buffer in bytes.

fill

fill(self: Self, val: SIMD[type, 1])

Assigns val to all elements in the Buffer.

The fill is performed in chunks of size N, where N is the native SIMD width of type on the system.

Args:

  • val (SIMD[type, 1]): The value to store.
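
As a rough illustration of a chunked fill, the Python sketch below writes simd_width elements per step and then finishes the leftover elements one at a time. The source does not say how a remainder is handled, so the scalar tail loop is an assumption, as is the name chunked_fill.

```python
def chunked_fill(data: list, val, simd_width: int) -> None:
    """Fill `data` in place with `val`, `simd_width` elements at a time."""
    n = len(data)
    vector_end = (n // simd_width) * simd_width
    # Main loop: one simulated vector store per full chunk.
    for i in range(0, vector_end, simd_width):
        data[i:i + simd_width] = [val] * simd_width
    # Tail loop (assumed): scalar stores for elements that do not fill a chunk.
    for i in range(vector_end, n):
        data[i] = val


buf = [0] * 10
chunked_fill(buf, 7, simd_width=4)  # two chunks of 4, then 2 scalar stores
print(buf)
```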

prefetch

prefetch[params: PrefetchOptions](self: Self, idx: Int)

Prefetch the data at the given index.

Parameters:

  • params (PrefetchOptions): The prefetch configuration.

Args:

  • idx (Int): The index of the prefetched location.

simd_fill

simd_fill[simd_width: Int](self: Self, val: SIMD[type, 1])

Assigns val to all elements in chunks of size simd_width.

Parameters:

  • simd_width (Int): The simd_width of the fill.

Args:

  • val (SIMD[type, 1]): The value to store.

simd_load

simd_load[width: Int](self: Self, idx: Int) -> SIMD[type, width]

Loads a simd value from the buffer at the specified index.

Parameters:

  • width (Int): The simd_width of the load.

Args:

  • idx (Int): The index into the Buffer.

Returns:

The simd value starting at the idx position and ending at idx+width.
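
Read as a half-open range, the load covers elements idx through idx+width-1. The Python sketch below models that reading with a list slice; the explicit bounds check is an addition for the sketch, since the source does not state whether the Mojo method checks bounds.

```python
def simd_load(data: list, idx: int, width: int) -> list:
    """Model a SIMD load: return `width` consecutive elements starting at `idx`."""
    # Bounds check added for the sketch; not documented for the Mojo method.
    if idx < 0 or idx + width > len(data):
        raise IndexError("load of width %d at index %d is out of range" % (width, idx))
    return data[idx:idx + width]


values = [10, 11, 12, 13, 14, 15, 16, 17]
print(simd_load(values, 2, 4))  # elements at indices 2, 3, 4, 5
```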

simd_nt_store

simd_nt_store[width: Int](self: Self, idx: Int, val: SIMD[type, width])

Stores a simd value using non-temporal store.

Constraints:

The address must be properly aligned, 64B for avx512, 32B for avx2, and 16B for avx.

Parameters:

  • width (Int): The width of the simd vector.

Args:

  • idx (Int): The index into the Buffer.
  • val (SIMD[type, width]): The value to store.
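
Whether a given idx satisfies the alignment constraint depends on the base address and the element size. The Python sketch below shows the arithmetic; the helper names and the example base address are hypothetical, and only the 64B/32B/16B requirements come from the constraint above.

```python
def element_address(base: int, idx: int, element_size: int) -> int:
    """Byte address of element `idx` in a buffer starting at `base`."""
    return base + idx * element_size


def is_aligned(address: int, alignment: int) -> bool:
    """True if `address` is a multiple of `alignment` bytes."""
    return address % alignment == 0


BASE = 0x1000  # hypothetical 64-byte-aligned base address
FLOAT32_SIZE = 4

# Index 16 is 64 bytes past the base, so it meets the 64B requirement.
print(is_aligned(element_address(BASE, 16, FLOAT32_SIZE), 64))
# Index 8 is only 32 bytes past the base: fine for 32B, not for 64B.
print(is_aligned(element_address(BASE, 8, FLOAT32_SIZE), 64))
print(is_aligned(element_address(BASE, 8, FLOAT32_SIZE), 32))
```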

simd_store

simd_store[width: Int](self: Self, idx: Int, val: SIMD[type, width])

Stores a simd value into the buffer at the specified index.

Parameters:

  • width (Int): The width of the simd vector.

Args:

  • idx (Int): The index into the Buffer.
  • val (SIMD[type, width]): The value to store.

stack_allocation

stack_allocation() -> Self

Constructs a buffer instance backed by stack allocated memory space.

Returns:

Constructed buffer with the allocated space.

zero

zero(self: Self)

Set all bytes of the Buffer to 0.

DynamicRankBuffer

DynamicRankBuffer represents a buffer with unknown rank, shapes and dtype.

It is not as efficient as the statically ranked buffer, but is useful when interacting with external functions. In particular, the shape is represented as a fixed-size (i.e., _MAX_RANK) array of dimensions to simplify the ABI.

Fields:

data

The pointer to the buffer.

rank

The buffer rank. Has a max value of _MAX_RANK.

shape

The dynamic shape of the buffer.

type

The dynamic dtype of the buffer.

Functions:

__init__

__init__(data: DTypePointer[invalid], rank: Int, shape: StaticIntTuple[8], type: DType) -> Self

Construct DynamicRankBuffer.

Args:

  • data (DTypePointer[invalid]): Pointer to the underlying data.
  • rank (Int): Rank of the buffer.
  • shape (StaticIntTuple[8]): Shapes of the buffer.
  • type (DType): dtype of the buffer.

Returns:

Constructed DynamicRankBuffer.

dim

dim(self: Self, idx: Int) -> Int

Get given dimension.

Args:

  • idx (Int): The dimension index.

Returns:

The buffer size on the given dimension.

get_shape

get_shape[rank: Int](self: Self) -> StaticIntTuple[rank]

Get a static tuple representing the buffer shape.

Parameters:

  • rank (Int): Rank of the buffer.

Returns:

A static tuple of size ‘rank’ filled with the buffer shape.

num_elements

num_elements(self: Self) -> Int

Get number of elements in the buffer.

Returns:

The number of elements in the buffer.
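
Because the shape is stored in a fixed array of 8 slots, only the first rank entries are meaningful. A plausible reading, sketched in Python below, is that the element count is the product of those entries. That unused slots are ignored is an assumption, and the function here is a model, not the Mojo method.

```python
from math import prod

MAX_RANK = 8  # matches StaticIntTuple[8] in the constructor signature


def num_elements(shape: tuple, rank: int) -> int:
    """Product of the first `rank` dimensions of a fixed-size shape array."""
    if not 0 < rank <= MAX_RANK:
        raise ValueError("rank must be in 1..%d" % MAX_RANK)
    # Slots at positions >= rank are treated as padding and ignored (assumed).
    return prod(shape[:rank])


shape = (2, 3, 4, 0, 0, 0, 0, 0)  # rank-3 shape padded to 8 slots
print(num_elements(shape, rank=3))  # 2 * 3 * 4
```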

rank_dispatch

rank_dispatch[func: fn[Int]() capturing -> None](self: Self)

Dispatch the function call based on buffer rank.

Constraints:

Rank must be positive and less than or equal to 8.

Parameters:

  • func (fn[Int]() capturing -> None): Function to dispatch. The function should be parametrized on an Int parameter, which receives the buffer rank when the function is called.

rank_dispatch[func: fn[Int]() capturing -> None](self: Self, out_chain: OutputChainPtr)

Dispatch the function call based on buffer rank.

Constraints:

Rank must be positive and less than or equal to 8.

Parameters:

  • func (fn[Int]() capturing -> None): Function to dispatch. The function should be parametrized on an Int parameter, which receives the buffer rank when the function is called.

Args:

  • out_chain (OutputChainPtr): The output chain.
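
The dispatch pattern can be pictured as a table with one specialization of func per allowed rank, from which the runtime rank picks an entry. The Python sketch below models that idea with functools.partial. Python has no compile-time parameters, so this only approximates the Mojo behavior, and the output chain argument is left out.

```python
from functools import partial

MAX_RANK = 8


def rank_dispatch(rank: int, func) -> None:
    """Call the specialization of `func` that matches the runtime `rank`."""
    if not 1 <= rank <= MAX_RANK:
        raise ValueError("rank must be in 1..%d" % MAX_RANK)
    # One specialization per allowed rank, standing in for the
    # compile-time instantiations func[1], func[2], ..., func[8].
    specializations = {r: partial(func, r) for r in range(1, MAX_RANK + 1)}
    specializations[rank]()


seen = []
rank_dispatch(3, seen.append)  # runs the rank-3 specialization
print(seen)
```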

to_buffer

to_buffer[type: DType](self: Self) -> Buffer[#pop.variant<:i1 0>, type]

Cast DynamicRankBuffer to Buffer.

Parameters:

  • type (DType): dtype of the buffer.

Returns:

Constructed Buffer.

to_ndbuffer

to_ndbuffer[rank: Int, type: DType](self: Self) -> NDBuffer[rank, create_unknown[rank](), type]

Cast the buffer to NDBuffer.

Constraints:

Rank of DynamicRankBuffer must equal rank of NDBuffer.

Parameters:

  • rank (Int): Rank of the buffer.
  • type (DType): dtype of the buffer.

Returns:

Constructed NDBuffer.

to_ndbuffer[rank: Int, type: DType](self: Self, stride: StaticIntTuple[rank]) -> NDBuffer[rank, create_unknown[rank](), type]

Cast the buffer to NDBuffer.

Constraints:

Rank of DynamicRankBuffer must equal rank of NDBuffer.

Parameters:

  • rank (Int): Rank of the buffer.
  • type (DType): dtype of the buffer.

Args:

  • stride (StaticIntTuple[rank]): Strides of the buffer.

Returns:

Constructed NDBuffer.

NDBuffer

An N-dimensional Buffer.

NDBuffer can be parametrized on rank, static dimensions and DType. It does not own its underlying pointer.

Parameters:

  • rank (Int): The rank of the buffer.
  • shape (DimList): The static size (if known) of the buffer.
  • type (DType): The element type of the buffer.

Fields:

data

The underlying data for the buffer. The pointer is not owned by the NDBuffer.

dynamic_dtype

The dynamic dtype.

dynamic_shape

The dynamic value of the shape.

dynamic_stride

The dynamic stride of the buffer.

is_contiguous

True if the contents of the buffer are contiguous in memory.

Functions:

__init__

__init__() -> Self

Default initializer for NDBuffer. By default the fields are all initialized to 0.

Returns:

The NDBuffer object.

__init__(ptr: Pointer[scalar<#lit.struct.extract<:_"$DType"::_DType type, "value">>]) -> Self

Constructor for NDBuffer with statically known rank, shapes and type.

Constraints:

The rank, shapes, and type are known.

Args:

  • ptr (Pointer[scalar<#lit.struct.extract<:_"$DType"::_DType type, "value">>]): Pointer to the data.

Returns:

The NDBuffer object.

__init__(ptr: DTypePointer[type]) -> Self

Constructor for NDBuffer with statically known rank, shapes and type.

Constraints:

The rank, shapes, and type are known.

Args:

  • ptr (DTypePointer[type]): Pointer to the data.

Returns:

The NDBuffer object.

__init__(ptr: pointer<scalar<#lit.struct.extract<:_"$DType"::_DType type, "value">>>, dynamic_shape: StaticIntTuple[rank], dynamic_dtype: DType) -> Self

Constructor for NDBuffer with statically known rank, but dynamic shapes and type.

Constraints:

The rank is known.

Args:

  • ptr (pointer<scalar<#lit.struct.extract<:_"$DType"::_DType type, "value">>>): Pointer to the data.
  • dynamic_shape (StaticIntTuple[rank]): A static tuple of size ‘rank’ representing shapes.
  • dynamic_dtype (DType): Dtype for the buffer.

Returns:

The NDBuffer object.

__init__(ptr: Pointer[scalar<#lit.struct.extract<:_"$DType"::_DType type, "value">>], dynamic_shape: StaticIntTuple[rank], dynamic_dtype: DType) -> Self

Constructor for NDBuffer with statically known rank, but dynamic shapes and type.

Constraints:

The rank is known.

Args:

  • ptr (Pointer[scalar<#lit.struct.extract<:_"$DType"::_DType type, "value">>]): Pointer to the data.
  • dynamic_shape (StaticIntTuple[rank]): A static tuple of size ‘rank’ representing shapes.
  • dynamic_dtype (DType): Dtype for the buffer.

Returns:

The NDBuffer object.

__init__(ptr: DTypePointer[type], dynamic_shape: StaticIntTuple[rank], dynamic_dtype: DType) -> Self

Constructor for NDBuffer with statically known rank, but dynamic shapes and type.

Constraints:

The rank is known.

Args:

  • ptr (DTypePointer[type]): Pointer to the data.
  • dynamic_shape (StaticIntTuple[rank]): A static tuple of size ‘rank’ representing shapes.
  • dynamic_dtype (DType): Dtype for the buffer.

Returns:

The NDBuffer object.

__init__(ptr: Pointer[scalar<#lit.struct.extract<:_"$DType"::_DType type, "value">>], dynamic_shape: StaticIntTuple[rank], dynamic_dtype: DType, dynamic_stride: StaticIntTuple[rank]) -> Self

Constructor for strided NDBuffer with statically known rank, but dynamic shapes and type.

Constraints:

The rank is known.

Args:

  • ptr (Pointer[scalar<#lit.struct.extract<:_"$DType"::_DType type, "value">>]): Pointer to the data.
  • dynamic_shape (StaticIntTuple[rank]): A static tuple of size ‘rank’ representing shapes.
  • dynamic_dtype (DType): Dtype for the buffer.
  • dynamic_stride (StaticIntTuple[rank]): A static tuple of size ‘rank’ representing strides.

Returns:

The NDBuffer object.

__init__(ptr: DTypePointer[type], dynamic_shape: StaticIntTuple[rank], dynamic_dtype: DType, dynamic_stride: StaticIntTuple[rank]) -> Self

Constructor for strided NDBuffer with statically known rank, but dynamic shapes and type.

Constraints:

The rank is known.

Args:

  • ptr (DTypePointer[type]): Pointer to the data.
  • dynamic_shape (StaticIntTuple[rank]): A static tuple of size ‘rank’ representing shapes.
  • dynamic_dtype (DType): Dtype for the buffer.
  • dynamic_stride (StaticIntTuple[rank]): A static tuple of size ‘rank’ representing strides.

Returns:

The NDBuffer object.

__init__(data: DTypePointer[type], _rank: Int, dynamic_shape: StaticIntTuple[rank], dynamic_dtype: DType, dynamic_stride: StaticIntTuple[rank], is_contiguous: Bool) -> Self

__getitem__

__getitem__(self: Self, *idx: Int) -> SIMD[type, 1]

Get an element from the buffer from the specified index.

Args:

  • idx (*Int): Index of the element to retrieve.

Returns:

The value of the element.

__getitem__(self: Self, idx: StaticIntTuple[rank]) -> SIMD[type, 1]

Get an element from the buffer from the specified index.

Args:

  • idx (StaticIntTuple[rank]): Index of the element to retrieve.

Returns:

The value of the element.

__setitem__

__setitem__(self: Self, idx: StaticIntTuple[rank], val: SIMD[type, 1])

Stores a single value into the buffer at the specified index.

Args:

  • idx (StaticIntTuple[rank]): The index into the Buffer.
  • val (SIMD[type, 1]): The value to store.

__len__

__len__(self: Self) -> Int

Computes the NDBuffer’s number of elements.

Returns:

The total number of elements in the NDBuffer.

aligned_simd_load

aligned_simd_load[width: Int, alignment: Int](self: Self, *idx: Int) -> SIMD[type, width]

Loads a simd value from the buffer at the specified index.

Constraints:

The buffer must be contiguous or width must be 1.

Parameters:

  • width (Int): The simd_width of the load.
  • alignment (Int): The alignment value.

Args:

  • idx (*Int): The index into the NDBuffer.

Returns:

The simd value starting at the idx position and ending at idx+width.

aligned_simd_load[width: Int, alignment: Int](self: Self, idx: VariadicList[Int]) -> SIMD[type, width]

Loads a simd value from the buffer at the specified index.

Constraints:

The buffer must be contiguous or width must be 1.

Parameters:

  • width (Int): The simd_width of the load.
  • alignment (Int): The alignment value.

Args:

  • idx (VariadicList[Int]): The index into the NDBuffer.

Returns:

The simd value starting at the idx position and ending at idx+width.

aligned_simd_load[width: Int, alignment: Int](self: Self, idx: StaticIntTuple[rank]) -> SIMD[type, width]

Loads a simd value from the buffer at the specified index.

Constraints:

The buffer must be contiguous or width must be 1.

Parameters:

  • width (Int): The simd_width of the load.
  • alignment (Int): The alignment value.

Args:

  • idx (StaticIntTuple[rank]): The index into the NDBuffer.

Returns:

The simd value starting at the idx position and ending at idx+width.

aligned_simd_load[width: Int, alignment: Int](self: Self, idx: StaticTuple[rank, Int]) -> SIMD[type, width]

Loads a simd value from the buffer at the specified index.

Constraints:

The buffer must be contiguous or width must be 1.

Parameters:

  • width (Int): The simd_width of the load.
  • alignment (Int): The alignment value.

Args:

  • idx (StaticTuple[rank, Int]): The index into the NDBuffer.

Returns:

The simd value starting at the idx position and ending at idx+width.

aligned_simd_store

aligned_simd_store[width: Int, alignment: Int](self: Self, idx: StaticIntTuple[rank], val: SIMD[type, width])

Stores a simd value into the buffer at the specified index.

Constraints:

The buffer must be contiguous or width must be 1.

Parameters:

  • width (Int): The width of the simd vector.
  • alignment (Int): The alignment value.

Args:

  • idx (StaticIntTuple[rank]): The index into the Buffer.
  • val (SIMD[type, width]): The value to store.

aligned_simd_store[width: Int, alignment: Int](self: Self, idx: StaticTuple[rank, Int], val: SIMD[type, width])

Stores a simd value into the buffer at the specified index.

Constraints:

The buffer must be contiguous or width must be 1.

Parameters:

  • width (Int): The width of the simd vector.
  • alignment (Int): The alignment value.

Args:

  • idx (StaticTuple[rank, Int]): The index into the Buffer.
  • val (SIMD[type, width]): The value to store.

aligned_stack_allocation

aligned_stack_allocation[alignment: Int]() -> Self

Constructs an NDBuffer instance backed by stack allocated memory space.

Parameters:

  • alignment (Int): Address alignment requirement for the allocation.

Returns:

Constructed NDBuffer with the allocated space.

bytecount

bytecount(self: Self) -> Int

Return the size of the NDBuffer in bytes.

Returns:

The size of the NDBuffer in bytes.

dim

dim[index: Int](self: Self) -> Int

Get the buffer dimension at the given index.

Parameters:

  • index (Int): The index of the dimension to get.

Returns:

The buffer size at the given dimension.

dim(self: Self, index: Int) -> Int

Get the buffer dimension at the given index.

Args:

  • index (Int): The index of the dimension to get.

Returns:

The buffer size at the given dimension.

fill

fill(self: Self, val: SIMD[type, 1])

Assigns val to all elements in the Buffer.

The fill is performed in chunks of size N, where N is the native SIMD width of type on the system.

Args:

  • val (SIMD[type, 1]): The value to store.

flatten

flatten(self: Self) -> Buffer[#pop.variant<:i1 0>, type]

Construct a flattened Buffer counterpart for this NDBuffer.

Constraints:

The buffer must be contiguous.

Returns:

Constructed Buffer object.

get_nd_index

get_nd_index(self: Self, idx: Int) -> StaticIntTuple[rank]

Computes the NDBuffer’s ND-index based on the flat index.

Args:

  • idx (Int): The flat index.

Returns:

The index positions.
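
Converting a flat index to an N-D index is usually done by repeated division by the dimension sizes, starting from the innermost dimension. The Python sketch below assumes a row-major (last dimension fastest) layout, which the source does not state explicitly.

```python
def get_nd_index(flat_idx: int, shape: tuple) -> tuple:
    """Convert a flat index to an N-D index, assuming row-major layout."""
    index = [0] * len(shape)
    # Peel off the innermost dimension first, then move outward.
    for axis in reversed(range(len(shape))):
        index[axis] = flat_idx % shape[axis]
        flat_idx //= shape[axis]
    return tuple(index)


print(get_nd_index(17, (2, 3, 4)))  # 17 = 1*12 + 1*4 + 1
```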

get_rank

get_rank(self: Self) -> Int

Returns the rank of the buffer.

Returns:

The rank of NDBuffer.

get_shape

get_shape(self: Self) -> StaticIntTuple[rank]

Returns the shapes of the buffer.

Returns:

A static tuple of size ‘rank’ representing shapes of the NDBuffer.

num_elements

num_elements(self: Self) -> Int

Computes the NDBuffer’s number of elements.

Returns:

The total number of elements in the NDBuffer.

prefetch

prefetch[params: PrefetchOptions](self: Self, *idx: Int)

Prefetch the data at the given index.

Parameters:

  • params (PrefetchOptions): The prefetch configuration.

Args:

  • idx (*Int): The N-D index of the prefetched location.

simd_fill

simd_fill[simd_width: Int](self: Self, val: SIMD[type, 1])

Assigns val to all elements in chunks of size simd_width.

Parameters:

  • simd_width (Int): The simd_width of the fill.

Args:

  • val (SIMD[type, 1]): The value to store.

simd_load

simd_load[width: Int](self: Self, *idx: Int) -> SIMD[type, width]

Loads a simd value from the buffer at the specified index.

Constraints:

The buffer must be contiguous or width must be 1.

Parameters:

  • width (Int): The simd_width of the load.

Args:

  • idx (*Int): The index into the NDBuffer.

Returns:

The simd value starting at the idx position and ending at idx+width.

simd_load[width: Int](self: Self, idx: VariadicList[Int]) -> SIMD[type, width]

Loads a simd value from the buffer at the specified index.

Constraints:

The buffer must be contiguous or width must be 1.

Parameters:

  • width (Int): The simd_width of the load.

Args:

  • idx (VariadicList[Int]): The index into the NDBuffer.

Returns:

The simd value starting at the idx position and ending at idx+width.

simd_load[width: Int](self: Self, idx: StaticIntTuple[rank]) -> SIMD[type, width]

Loads a simd value from the buffer at the specified index.

Constraints:

The buffer must be contiguous or width must be 1.

Parameters:

  • width (Int): The simd_width of the load.

Args:

  • idx (StaticIntTuple[rank]): The index into the NDBuffer.

Returns:

The simd value starting at the idx position and ending at idx+width.

simd_load[width: Int](self: Self, idx: StaticTuple[rank, Int]) -> SIMD[type, width]

Loads a simd value from the buffer at the specified index.

Constraints:

The buffer must be contiguous or width must be 1.

Parameters:

  • width (Int): The simd_width of the load.

Args:

  • idx (StaticTuple[rank, Int]): The index into the NDBuffer.

Returns:

The simd value starting at the idx position and ending at idx+width.

simd_nt_store

simd_nt_store[width: Int](self: Self, idx: StaticIntTuple[rank], val: SIMD[type, width])

Stores a simd value using non-temporal store.

Constraints:

The buffer must be contiguous. The address must be properly aligned, 64B for avx512, 32B for avx2, and 16B for avx.

Parameters:

  • width (Int): The width of the simd vector.

Args:

  • idx (StaticIntTuple[rank]): The index into the Buffer.
  • val (SIMD[type, width]): The value to store.

simd_nt_store[width: Int](self: Self, idx: StaticTuple[rank, Int], val: SIMD[type, width])

Stores a simd value using non-temporal store.

Constraints:

The buffer must be contiguous. The address must be properly aligned, 64B for avx512, 32B for avx2, and 16B for avx.

Parameters:

  • width (Int): The width of the simd vector.

Args:

  • idx (StaticTuple[rank, Int]): The index into the Buffer.
  • val (SIMD[type, width]): The value to store.

simd_store

simd_store[width: Int](self: Self, idx: StaticIntTuple[rank], val: SIMD[type, width])

Stores a simd value into the buffer at the specified index.

Constraints:

The buffer must be contiguous or width must be 1.

Parameters:

  • width (Int): The width of the simd vector.

Args:

  • idx (StaticIntTuple[rank]): The index into the Buffer.
  • val (SIMD[type, width]): The value to store.

simd_store[width: Int](self: Self, idx: StaticTuple[rank, Int], val: SIMD[type, width])

Stores a simd value into the buffer at the specified index.

Constraints:

The buffer must be contiguous or width must be 1.

Parameters:

  • width (Int): The width of the simd vector.

Args:

  • idx (StaticTuple[rank, Int]): The index into the Buffer.
  • val (SIMD[type, width]): The value to store.

size

size(self: Self) -> Int

Computes the NDBuffer’s number of elements.

Returns:

The total number of elements in the NDBuffer.

stack_allocation

stack_allocation() -> Self

Constructs an NDBuffer instance backed by stack allocated memory space.

Returns:

Constructed NDBuffer with the allocated space.

stride

stride(self: Self, index: Int) -> Int

Get the buffer stride at the given index.

Args:

  • index (Int): The index of the dimension to get the stride for.

Returns:

The stride at the given dimension.
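
For a contiguous buffer, the strides follow from the shape. The Python sketch below computes them and uses them to turn an N-D index into a flat offset. It assumes row-major layout and strides measured in elements rather than bytes, neither of which the source confirms.

```python
def contiguous_strides(shape: tuple) -> tuple:
    """Row-major strides, in elements, for a contiguous buffer of `shape`."""
    strides = [1] * len(shape)
    # Each stride is the product of all dimension sizes to its right.
    for axis in range(len(shape) - 2, -1, -1):
        strides[axis] = strides[axis + 1] * shape[axis + 1]
    return tuple(strides)


def flat_offset(index: tuple, strides: tuple) -> int:
    """Element offset of an N-D index given per-dimension strides."""
    return sum(i * s for i, s in zip(index, strides))


strides = contiguous_strides((2, 3, 4))
print(strides)                          # (12, 4, 1)
print(flat_offset((1, 2, 3), strides))  # 12 + 8 + 3
```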

zero

zero(self: Self)

Set all bytes of the NDBuffer to 0.

Constraints:

The buffer must be contiguous.

partial_simd_load

partial_simd_load[width: Int, type: DType](storage: DTypePointer[type], lbound: Int, rbound: Int, pad_value: SIMD[type, 1]) -> SIMD[type, width]

Loads a vector with dynamic bound.

Out-of-bound data will be filled with the pad value. Data is valid if lbound <= idx < rbound for idx from 0 to (width-1).

For example, given:

    addr 0  1  2  3
    data x 42 43  x

    partial_simd_load[4](addr0, 1, 3, 0) # gives [0 42 43 0]

Parameters:

  • width (Int): The system simd vector size.
  • type (DType): The underlying dtype of computation.

Args:

  • storage (DTypePointer[type]): Pointer to the address to perform load.
  • lbound (Int): Lower bound of valid index within simd (inclusive).
  • rbound (Int): Upper bound of valid index within simd (non-inclusive).
  • pad_value (SIMD[type, 1]): Value to fill for out of bound indices.

Returns:

The loaded SIMD vector, with out-of-bound lanes filled with pad_value.
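
The bound rule can be modeled lane by lane. The Python sketch below reproduces the documented example; it describes the semantics only and uses a list in place of a pointer, with None marking memory that must not be read.

```python
def partial_simd_load(storage: list, width: int, lbound: int, rbound: int, pad_value):
    """Load `width` lanes; lanes outside [lbound, rbound) take `pad_value`."""
    # storage[i] is only read for lanes inside the valid range.
    return [
        storage[i] if lbound <= i < rbound else pad_value
        for i in range(width)
    ]


memory = [None, 42, 43, None]  # None marks locations that must not be read
print(partial_simd_load(memory, 4, 1, 3, 0))  # [0, 42, 43, 0]
```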

partial_simd_store

partial_simd_store[width: Int, type: DType](storage: DTypePointer[type], lbound: Int, rbound: Int, data: SIMD[type, width])

Stores a vector with dynamic bound.

Out-of-bound data will be ignored. Data is valid if lbound <= idx < rbound for idx from 0 to (width-1).

For example, given:

    addr 0 1 2 3
    data 0 0 0 0

    partial_simd_store[4](addr0, 1, 3, [-1, 42, 43, -1]) # gives [0 42 43 0]

Parameters:

  • width (Int): The system simd vector size.
  • type (DType): The underlying dtype of computation.

Args:

  • storage (DTypePointer[type]): Pointer to the address to perform the store.
  • lbound (Int): Lower bound of valid index within simd (inclusive).
  • rbound (Int): Upper bound of valid index within simd (non-inclusive).
  • data (SIMD[type, width]): The vector value to store.
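
The store-side rule mirrors the load: only lanes inside the valid range are written, and memory outside it is left untouched. A Python model of the documented example, using a list in place of a pointer:

```python
def partial_simd_store(storage: list, lbound: int, rbound: int, data: list) -> None:
    """Store lanes of `data` into `storage`, skipping lanes outside [lbound, rbound)."""
    for lane, value in enumerate(data):
        if lbound <= lane < rbound:
            storage[lane] = value


memory = [0, 0, 0, 0]
partial_simd_store(memory, 1, 3, [-1, 42, 43, -1])
print(memory)  # [0, 42, 43, 0]
```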

prod_dims

prod_dims[start_dim: Int, end_dim: Int, rank: Int, shape: DimList, type: DType](x: NDBuffer[rank, shape, type]) -> Int

Compute the product of a slice of the given buffer’s dimensions.

Parameters:

  • start_dim (Int): The index at which to begin computing the product.
  • end_dim (Int): The index at which to stop computing the product.
  • rank (Int): The rank of the NDBuffer.
  • shape (DimList): The shape of the NDBuffer.
  • type (DType): The element-type of the NDBuffer.

Args:

  • x (NDBuffer[rank, shape, type]): The NDBuffer whose dimensions will be multiplied.

Returns:

The product of the specified slice of the buffer’s dimensions.
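
A small Python model of the computation follows. It assumes end_dim is exclusive, which is one reading of "the index at which to stop" that the source does not confirm.

```python
from math import prod


def prod_dims(shape: tuple, start_dim: int, end_dim: int) -> int:
    """Product of shape[start_dim:end_dim], with end_dim exclusive (assumed)."""
    return prod(shape[start_dim:end_dim])


shape = (2, 3, 4, 5)
print(prod_dims(shape, 1, 3))  # 3 * 4
print(prod_dims(shape, 0, 4))  # 2 * 3 * 4 * 5
```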