buffer

Module

Implements the Buffer class.

You can import these APIs from the memory package. For example:

from memory.buffer import Buffer

Buffer

Defines a Buffer that can be parametrized on a static size and DType.

The Buffer does not own its underlying pointer.

Parameters:

  • size (Dim): The static size (if known) of the Buffer.
  • type (DType): The element type of the Buffer.

Fields:

  • data (DTypePointer[type]): The underlying data pointer.
  • dynamic_size (Int): The dynamic size of the buffer.
  • dtype (DType): The dynamic data type of the buffer.

Functions:

__init__

__init__() -> Self

Default initializer for Buffer. By default the fields are all initialized to 0.

Returns:

The Buffer object.

__init__(ptr: Pointer[scalar<#lit.struct.extract<:!kgen.declref<_"$builtin"::_"$dtype"::_DType> type, "value">>]) -> Self

Constructs a Buffer with statically known size and type.

Constraints:

The size is known.

Args:

  • ptr (Pointer[scalar<#lit.struct.extract<:!kgen.declref<_"$builtin"::_"$dtype"::_DType> type, "value">>]): Pointer to the data.

Returns:

The buffer object.

__init__(ptr: DTypePointer[type]) -> Self

Constructs a Buffer with statically known size and type.

Constraints:

The size is known.

Args:

  • ptr (DTypePointer[type]): Pointer to the data.

Returns:

The buffer object.

__init__(ptr: Pointer[scalar<#lit.struct.extract<:!kgen.declref<_"$builtin"::_"$dtype"::_DType> type, "value">>], in_size: Int) -> Self

Constructs a Buffer with statically known type.

Constraints:

The size is unknown.

Args:

  • ptr (Pointer[scalar<#lit.struct.extract<:!kgen.declref<_"$builtin"::_"$dtype"::_DType> type, "value">>]): Pointer to the data.
  • in_size (Int): Dynamic size of the buffer.

Returns:

The buffer object.

__init__(ptr: DTypePointer[type], in_size: Int) -> Self

Constructs a Buffer with statically known type.

Constraints:

The size is unknown.

Args:

  • ptr (DTypePointer[type]): Pointer to the data.
  • in_size (Int): Dynamic size of the buffer.

Returns:

The buffer object.

__init__(data: DTypePointer[type], dynamic_size: Int, dtype: DType) -> Self

__copyinit__

__copyinit__(existing: Self) -> Self

__getitem__

__getitem__(self: Self, idx: Int) -> SIMD[type, 1]

Loads a single element (SIMD of size 1) from the buffer at the specified index.

Args:

  • idx (Int): The index into the Buffer.

Returns:

The value at the idx position.

__setitem__

__setitem__(self: Self, idx: Int, val: scalar<#lit.struct.extract<:!kgen.declref<_"$builtin"::_"$dtype"::_DType> type, "value">>)

Stores a single value into the buffer at the specified index.

Args:

  • idx (Int): The index into the Buffer.
  • val (scalar<#lit.struct.extract<:!kgen.declref<_"$builtin"::_"$dtype"::_DType> type, "value">>): The value to store.

__setitem__(self: Self, idx: Int, val: SIMD[type, 1])

Stores a single value into the buffer at the specified index.

Args:

  • idx (Int): The index into the Buffer.
  • val (SIMD[type, 1]): The value to store.

__len__

__len__(self: Self) -> Int

Gets the size if it is a known constant; otherwise gets the dynamic_size.

If the Buffer size is a known constant, that size is returned. Otherwise, the dynamic_size is returned.

Returns:

The size if static, otherwise dynamic_size.
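
The size-resolution rule can be modeled in a few lines of Python. This is an illustrative sketch, not Mojo code: `BufferModel`, `static_size`, and the use of `None` for an unknown static size are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BufferModel:
    """Hypothetical Python stand-in for Buffer's size bookkeeping."""

    static_size: Optional[int]  # models the `size` parameter; None = unknown
    dynamic_size: int = 0       # models the `dynamic_size` field

    def __len__(self) -> int:
        # Prefer the static size when it is known; otherwise fall back to
        # the size recorded at runtime.
        if self.static_size is not None:
            return self.static_size
        return self.dynamic_size
```

In this model, `len(BufferModel(static_size=8))` uses the static size, while `len(BufferModel(static_size=None, dynamic_size=5))` falls back to the dynamic one.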

simd_load

simd_load[width: Int](self: Self, idx: Int) -> SIMD[type, width]

Loads a simd value from the buffer at the specified index.

Parameters:

  • width (Int): The simd_width of the load.

Args:

  • idx (Int): The index into the Buffer.

Returns:

The simd value starting at the idx position and ending at idx+width.

aligned_simd_load

aligned_simd_load[width: Int, alignment: Int](self: Self, idx: Int) -> SIMD[type, width]

Loads a simd value from the buffer at the specified index.

Parameters:

  • width (Int): The simd_width of the load.
  • alignment (Int): The alignment value.

Args:

  • idx (Int): The index into the Buffer.

Returns:

The simd value starting at the idx position and ending at idx+width.

simd_store

simd_store[width: Int](self: Self, idx: Int, val: SIMD[type, width])

Stores a simd value into the buffer at the specified index.

Parameters:

  • width (Int): The width of the simd vector.

Args:

  • idx (Int): The index into the Buffer.
  • val (SIMD[type, width]): The value to store.
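
As a rough mental model, simd_load and simd_store move `width` consecutive elements starting at `idx`. The Python sketch below uses a list in place of the underlying pointer. The function names mirror the methods above, but the bounds check is an assumption added for the example; this page does not say whether Buffer performs one.

```python
from typing import List


def simd_load(data: List[float], idx: int, width: int) -> List[float]:
    """Model of a width-lane load: lanes come from data[idx] .. data[idx + width - 1]."""
    if idx < 0 or idx + width > len(data):
        # Assumption for the model only: fail loudly instead of reading
        # past the end of the storage.
        raise IndexError("load would run past the end of the buffer")
    return data[idx:idx + width]


def simd_store(data: List[float], idx: int, val: List[float]) -> None:
    """Model of a store: writes len(val) consecutive lanes starting at idx."""
    if idx < 0 or idx + len(val) > len(data):
        raise IndexError("store would run past the end of the buffer")
    data[idx:idx + len(val)] = val
```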

aligned_simd_store

aligned_simd_store[width: Int, alignment: Int](self: Self, idx: Int, val: SIMD[type, width])

Stores a simd value into the buffer at the specified index.

Parameters:

  • width (Int): The width of the simd vector.
  • alignment (Int): The alignment value.

Args:

  • idx (Int): The index into the Buffer.
  • val (SIMD[type, width]): The value to store.

simd_nt_store

simd_nt_store[width: Int](self: Self, idx: Int, val: SIMD[type, width])

Stores a simd value using non-temporal store.

Constraints:

The address must be properly aligned: 64B for avx512, 32B for avx2, and 16B for avx.

Parameters:

  • width (Int): The width of the simd vector.

Args:

  • idx (Int): The index into the Buffer.
  • val (SIMD[type, width]): The value to store.
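
The alignment constraint amounts to a modulo check on the destination address. The following Python sketch illustrates it; the table simply restates the byte counts listed under Constraints, and `is_nt_store_aligned` is a hypothetical helper, not part of this module.

```python
# Required alignment in bytes, as listed in the constraint above.
REQUIRED_ALIGNMENT = {"avx512": 64, "avx2": 32, "avx": 16}


def is_nt_store_aligned(address: int, isa: str) -> bool:
    """Return True if `address` meets the alignment listed for `isa`."""
    return address % REQUIRED_ALIGNMENT[isa] == 0
```

For instance, address 96 is a multiple of 32 but not of 64, so under this model it would be acceptable for avx2 but not for avx512.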

prefetch

prefetch[params: PrefetchOptions](self: Self, idx: Int)

Prefetches the data at the given index.

Parameters:

  • params (PrefetchOptions): The prefetch configuration.

Args:

  • idx (Int): The index of the prefetched location.

bytecount

bytecount(self: Self) -> Int

Returns the size of the Buffer in bytes.

Returns:

The size of the Buffer in bytes.
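
A plausible reading is that the byte count equals the element count times the element size. The Python sketch below assumes exactly that. The dtype labels and the `DTYPE_SIZE_BYTES` table are illustrative names, not identifiers from this module.

```python
# Element sizes in bytes for a few common dtypes (illustrative labels).
DTYPE_SIZE_BYTES = {"int8": 1, "float16": 2, "float32": 4, "float64": 8}


def bytecount(num_elements: int, dtype: str) -> int:
    """Model: total bytes = number of elements * bytes per element."""
    return num_elements * DTYPE_SIZE_BYTES[dtype]
```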

zero

zero(self: Self)

Sets all bytes of the Buffer to 0.

simd_fill

simd_fill[simd_width: Int](self: Self, val: SIMD[type, 1])

Assigns val to all elements in chunks of size simd_width.

Parameters:

  • simd_width (Int): The simd_width of the fill.

Args:

  • val (SIMD[type, 1]): The value to store.

fill

fill(self: Self, val: SIMD[type, 1])

Assigns val to all elements in the Buffer.

The fill is performed in chunks of size N, where N is the native SIMD width of type on the system.

Args:

  • val (SIMD[type, 1]): The value to store.
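
A chunked fill can be sketched as follows in Python. This is a model, not the library's implementation: the page does not say how a buffer whose length is not a multiple of the chunk size is handled, so the element-by-element tail loop here is an assumption.

```python
from typing import List


def simd_fill(data: List[int], val: int, simd_width: int) -> int:
    """Fill `data` with `val` in chunks of `simd_width`; return the chunk count."""
    n = len(data)
    vector_end = n - (n % simd_width)  # last index covered by full chunks
    chunks = 0
    for start in range(0, vector_end, simd_width):
        # One "vector store" of simd_width identical lanes.
        data[start:start + simd_width] = [val] * simd_width
        chunks += 1
    # Assumed tail handling: leftover elements are written one at a time.
    for i in range(vector_end, n):
        data[i] = val
    return chunks
```

In this model, fill would correspond to calling `simd_fill` with `simd_width` set to the native SIMD width for the element type.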

aligned_stack_allocation

aligned_stack_allocation[alignment: Int]() -> Self

Constructs a buffer instance backed by stack allocated memory space.

Parameters:

  • alignment (Int): Address alignment requirement for the allocation.

Returns:

Constructed buffer with the allocated space.

stack_allocation

stack_allocation() -> Self

Constructs a buffer instance backed by stack allocated memory space.

Returns:

Constructed buffer with the allocated space.

NDBuffer

An N-dimensional Buffer.

NDBuffer can be parametrized on rank, static dimensions and DType. It does not own its underlying pointer.

Parameters:

  • rank (Int): The rank of the buffer.
  • shape (DimList): The static size (if known) of the buffer.
  • type (DType): The element type of the buffer.

Fields:

  • data (DTypePointer[type]): The underlying data for the buffer. The pointer is not owned by the NDBuffer.
  • dynamic_shape (StaticIntTuple[rank]): The dynamic value of the shape.
  • dynamic_stride (StaticIntTuple[rank]): The dynamic stride of the buffer.
  • is_contiguous (Bool): True if the contents of the buffer are contiguous in memory.

Functions:

__init__

__init__() -> Self

Default initializer for NDBuffer. By default the fields are all initialized to 0.

Returns:

The NDBuffer object.

__init__(ptr: Pointer[scalar<#lit.struct.extract<:!kgen.declref<_"$builtin"::_"$dtype"::_DType> type, "value">>]) -> Self

Constructs an NDBuffer with statically known rank, shapes and type.

Constraints:

The rank, shapes, and type are known.

Args:

  • ptr (Pointer[scalar<#lit.struct.extract<:!kgen.declref<_"$builtin"::_"$dtype"::_DType> type, "value">>]): Pointer to the data.

Returns:

The NDBuffer object.

__init__(ptr: DTypePointer[type]) -> Self

Constructs an NDBuffer with statically known rank, shapes and type.

Constraints:

The rank, shapes, and type are known.

Args:

  • ptr (DTypePointer[type]): Pointer to the data.

Returns:

The NDBuffer object.

__init__(ptr: pointer<scalar<#lit.struct.extract<:!kgen.declref<_"$builtin"::_"$dtype"::_DType> type, "value">>>, dynamic_shape: StaticIntTuple[rank]) -> Self

Constructs an NDBuffer with statically known rank, but dynamic shapes and type.

Constraints:

The rank is known.

Args:

  • ptr (pointer<scalar<#lit.struct.extract<:!kgen.declref<_"$builtin"::_"$dtype"::_DType> type, "value">>>): Pointer to the data.
  • dynamic_shape (StaticIntTuple[rank]): A static tuple of size ‘rank’ representing shapes.

Returns:

The NDBuffer object.

__init__(ptr: Pointer[scalar<#lit.struct.extract<:!kgen.declref<_"$builtin"::_"$dtype"::_DType> type, "value">>], dynamic_shape: StaticIntTuple[rank]) -> Self

Constructs an NDBuffer with statically known rank, but dynamic shapes and type.

Constraints:

The rank is known.

Args:

  • ptr (Pointer[scalar<#lit.struct.extract<:!kgen.declref<_"$builtin"::_"$dtype"::_DType> type, "value">>]): Pointer to the data.
  • dynamic_shape (StaticIntTuple[rank]): A static tuple of size ‘rank’ representing shapes.

Returns:

The NDBuffer object.

__init__(ptr: DTypePointer[type], dynamic_shape: StaticIntTuple[rank]) -> Self

Constructs an NDBuffer with statically known rank, but dynamic shapes and type.

Constraints:

The rank is known.

Args:

  • ptr (DTypePointer[type]): Pointer to the data.
  • dynamic_shape (StaticIntTuple[rank]): A static tuple of size ‘rank’ representing shapes.

Returns:

The NDBuffer object.

__init__(ptr: Pointer[scalar<#lit.struct.extract<:!kgen.declref<_"$builtin"::_"$dtype"::_DType> type, "value">>], dynamic_shape: StaticIntTuple[rank], dynamic_stride: StaticIntTuple[rank]) -> Self

Constructs a strided NDBuffer with statically known rank, but dynamic shapes and type.

Constraints:

The rank is known.

Args:

  • ptr (Pointer[scalar<#lit.struct.extract<:!kgen.declref<_"$builtin"::_"$dtype"::_DType> type, "value">>]): Pointer to the data.
  • dynamic_shape (StaticIntTuple[rank]): A static tuple of size ‘rank’ representing shapes.
  • dynamic_stride (StaticIntTuple[rank]): A static tuple of size ‘rank’ representing strides.

Returns:

The NDBuffer object.

__init__(ptr: DTypePointer[type], dynamic_shape: StaticIntTuple[rank], dynamic_stride: StaticIntTuple[rank]) -> Self

Constructs a strided NDBuffer with statically known rank, but dynamic shapes and type.

Constraints:

The rank is known.

Args:

  • ptr (DTypePointer[type]): Pointer to the data.
  • dynamic_shape (StaticIntTuple[rank]): A static tuple of size ‘rank’ representing shapes.
  • dynamic_stride (StaticIntTuple[rank]): A static tuple of size ‘rank’ representing strides.

Returns:

The NDBuffer object.

__init__(data: DTypePointer[type], dynamic_shape: StaticIntTuple[rank], dynamic_stride: StaticIntTuple[rank], is_contiguous: Bool) -> Self

__getitem__

__getitem__(self: Self, *idx: Int) -> SIMD[type, 1]

Gets an element from the buffer at the specified index.

Args:

  • idx (*Int): Index of the element to retrieve.

Returns:

The value of the element.

__getitem__(self: Self, idx: StaticIntTuple[rank]) -> SIMD[type, 1]

Gets an element from the buffer at the specified index.

Args:

  • idx (StaticIntTuple[rank]): Index of the element to retrieve.

Returns:

The value of the element.

__setitem__

__setitem__(self: Self, idx: StaticIntTuple[rank], val: SIMD[type, 1])

Stores a single value into the buffer at the specified index.

Args:

  • idx (StaticIntTuple[rank]): The index into the Buffer.
  • val (SIMD[type, 1]): The value to store.

get_rank

get_rank(self: Self) -> Int

Returns the rank of the buffer.

Returns:

The rank of NDBuffer.

get_shape

get_shape(self: Self) -> StaticIntTuple[rank]

Returns the shapes of the buffer.

Returns:

A static tuple of size ‘rank’ representing shapes of the NDBuffer.

get_nd_index

get_nd_index(self: Self, idx: Int) -> StaticIntTuple[rank]

Computes the NDBuffer’s ND-index based on the flat index.

Args:

  • idx (Int): The flat index.

Returns:

The index positions.
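
The conversion from a flat index to an N-D index can be illustrated with repeated division. The Python sketch below assumes a row-major (last dimension fastest) layout, which this page does not state explicitly; `shape` stands in for the buffer's dynamic shape.

```python
from typing import Sequence, Tuple


def get_nd_index(flat_idx: int, shape: Sequence[int]) -> Tuple[int, ...]:
    """Convert a flat index into per-dimension indices (row-major assumption)."""
    index = [0] * len(shape)
    # Peel off the fastest-varying (last) dimension first.
    for axis in range(len(shape) - 1, -1, -1):
        flat_idx, index[axis] = divmod(flat_idx, shape[axis])
    return tuple(index)
```

With a shape of (2, 3, 4), flat index 17 maps to (1, 1, 1) under this assumption, since 1*12 + 1*4 + 1 = 17.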

__len__

__len__(self: Self) -> Int

Computes the NDBuffer’s number of elements.

Returns:

The total number of elements in the NDBuffer.

num_elements

num_elements(self: Self) -> Int

Computes the NDBuffer’s number of elements.

Returns:

The total number of elements in the NDBuffer.

size

size(self: Self) -> Int

Computes the NDBuffer’s number of elements.

Returns:

The total number of elements in the NDBuffer.

simd_load

simd_load[width: Int](self: Self, *idx: Int) -> SIMD[type, width]

Loads a simd value from the buffer at the specified index.

Constraints:

The buffer must be contiguous or width must be 1.

Parameters:

  • width (Int): The simd_width of the load.

Args:

  • idx (*Int): The index into the NDBuffer.

Returns:

The simd value starting at the idx position and ending at idx+width.

simd_load[width: Int](self: Self, idx: VariadicList[Int]) -> SIMD[type, width]

Loads a simd value from the buffer at the specified index.

Constraints:

The buffer must be contiguous or width must be 1.

Parameters:

  • width (Int): The simd_width of the load.

Args:

  • idx (VariadicList[Int]): The index into the NDBuffer.

Returns:

The simd value starting at the idx position and ending at idx+width.

simd_load[width: Int](self: Self, idx: StaticIntTuple[rank]) -> SIMD[type, width]

Loads a simd value from the buffer at the specified index.

Constraints:

The buffer must be contiguous or width must be 1.

Parameters:

  • width (Int): The simd_width of the load.

Args:

  • idx (StaticIntTuple[rank]): The index into the NDBuffer.

Returns:

The simd value starting at the idx position and ending at idx+width.

simd_load[width: Int](self: Self, idx: StaticTuple[rank, Int]) -> SIMD[type, width]

Loads a simd value from the buffer at the specified index.

Constraints:

The buffer must be contiguous or width must be 1.

Parameters:

  • width (Int): The simd_width of the load.

Args:

  • idx (StaticTuple[rank, Int]): The index into the NDBuffer.

Returns:

The simd value starting at the idx position and ending at idx+width.

aligned_simd_load

aligned_simd_load[width: Int, alignment: Int](self: Self, *idx: Int) -> SIMD[type, width]

Loads a simd value from the buffer at the specified index.

Constraints:

The buffer must be contiguous or width must be 1.

Parameters:

  • width (Int): The simd_width of the load.
  • alignment (Int): The alignment value.

Args:

  • idx (*Int): The index into the NDBuffer.

Returns:

The simd value starting at the idx position and ending at idx+width.

aligned_simd_load[width: Int, alignment: Int](self: Self, idx: VariadicList[Int]) -> SIMD[type, width]

Loads a simd value from the buffer at the specified index.

Constraints:

The buffer must be contiguous or width must be 1.

Parameters:

  • width (Int): The simd_width of the load.
  • alignment (Int): The alignment value.

Args:

  • idx (VariadicList[Int]): The index into the NDBuffer.

Returns:

The simd value starting at the idx position and ending at idx+width.

aligned_simd_load[width: Int, alignment: Int](self: Self, idx: StaticIntTuple[rank]) -> SIMD[type, width]

Loads a simd value from the buffer at the specified index.

Constraints:

The buffer must be contiguous or width must be 1.

Parameters:

  • width (Int): The simd_width of the load.
  • alignment (Int): The alignment value.

Args:

  • idx (StaticIntTuple[rank]): The index into the NDBuffer.

Returns:

The simd value starting at the idx position and ending at idx+width.

aligned_simd_load[width: Int, alignment: Int](self: Self, idx: StaticTuple[rank, Int]) -> SIMD[type, width]

Loads a simd value from the buffer at the specified index.

Constraints:

The buffer must be contiguous or width must be 1.

Parameters:

  • width (Int): The simd_width of the load.
  • alignment (Int): The alignment value.

Args:

  • idx (StaticTuple[rank, Int]): The index into the NDBuffer.

Returns:

The simd value starting at the idx position and ending at idx+width.

simd_store

simd_store[width: Int](self: Self, idx: StaticIntTuple[rank], val: SIMD[type, width])

Stores a simd value into the buffer at the specified index.

Constraints:

The buffer must be contiguous or width must be 1.

Parameters:

  • width (Int): The width of the simd vector.

Args:

  • idx (StaticIntTuple[rank]): The index into the Buffer.
  • val (SIMD[type, width]): The value to store.

simd_store[width: Int](self: Self, idx: StaticTuple[rank, Int], val: SIMD[type, width])

Stores a simd value into the buffer at the specified index.

Constraints:

The buffer must be contiguous or width must be 1.

Parameters:

  • width (Int): The width of the simd vector.

Args:

  • idx (StaticTuple[rank, Int]): The index into the Buffer.
  • val (SIMD[type, width]): The value to store.

aligned_simd_store

aligned_simd_store[width: Int, alignment: Int](self: Self, idx: StaticIntTuple[rank], val: SIMD[type, width])

Stores a simd value into the buffer at the specified index.

Constraints:

The buffer must be contiguous or width must be 1.

Parameters:

  • width (Int): The width of the simd vector.
  • alignment (Int): The alignment value.

Args:

  • idx (StaticIntTuple[rank]): The index into the Buffer.
  • val (SIMD[type, width]): The value to store.

aligned_simd_store[width: Int, alignment: Int](self: Self, idx: StaticTuple[rank, Int], val: SIMD[type, width])

Stores a simd value into the buffer at the specified index.

Constraints:

The buffer must be contiguous or width must be 1.

Parameters:

  • width (Int): The width of the simd vector.
  • alignment (Int): The alignment value.

Args:

  • idx (StaticTuple[rank, Int]): The index into the Buffer.
  • val (SIMD[type, width]): The value to store.

simd_nt_store

simd_nt_store[width: Int](self: Self, idx: StaticIntTuple[rank], val: SIMD[type, width])

Stores a simd value using non-temporal store.

Constraints:

The buffer must be contiguous. The address must be properly aligned: 64B for avx512, 32B for avx2, and 16B for avx.

Parameters:

  • width (Int): The width of the simd vector.

Args:

  • idx (StaticIntTuple[rank]): The index into the Buffer.
  • val (SIMD[type, width]): The value to store.

simd_nt_store[width: Int](self: Self, idx: StaticTuple[rank, Int], val: SIMD[type, width])

Stores a simd value using non-temporal store.

Constraints:

The buffer must be contiguous. The address must be properly aligned: 64B for avx512, 32B for avx2, and 16B for avx.

Parameters:

  • width (Int): The width of the simd vector.

Args:

  • idx (StaticTuple[rank, Int]): The index into the Buffer.
  • val (SIMD[type, width]): The value to store.

dim

dim[index: Int](self: Self) -> Int

Gets the buffer dimension at the given index.

Parameters:

  • index (Int): The index of the dimension to get.

Returns:

The buffer size at the given dimension.

dim(self: Self, index: Int) -> Int

Gets the buffer dimension at the given index.

Args:

  • index (Int): The index of the dimension to get.

Returns:

The buffer size at the given dimension.

stride

stride(self: Self, index: Int) -> Int

Gets the buffer stride at the given index.

Args:

  • index (Int): The index of the dimension to get the stride for.

Returns:

The stride at the given dimension.
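
Strides determine how an N-D index maps to a flat element offset. The Python sketch below assumes strides are measured in elements and that a contiguous buffer uses row-major strides. Both are assumptions for illustration, and `contiguous_strides` and `flat_offset` are hypothetical helper names.

```python
from typing import Sequence, Tuple


def contiguous_strides(shape: Sequence[int]) -> Tuple[int, ...]:
    """Row-major strides (in elements) for a densely packed buffer."""
    strides = [1] * len(shape)
    for axis in range(len(shape) - 2, -1, -1):
        strides[axis] = strides[axis + 1] * shape[axis + 1]
    return tuple(strides)


def flat_offset(index: Sequence[int], strides: Sequence[int]) -> int:
    """Element offset of `index`: the dot product of index and strides."""
    return sum(i * s for i, s in zip(index, strides))
```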

flatten

flatten(self: Self) -> Buffer[#pop.variant<:i1 0>, type]

Constructs a flattened Buffer counterpart for this NDBuffer.

Constraints:

The buffer must be contiguous.

Returns:

Constructed Buffer object.

make_dims_unknown

make_dims_unknown(self: Self) -> NDBuffer[rank, create_unknown[$builtin::$int::Int][rank](), type]

Rebinds the NDBuffer to one with unknown shape.

Returns:

The rebound NDBuffer with unknown shape.

bytecount

bytecount(self: Self) -> Int

Returns the size of the NDBuffer in bytes.

Returns:

The size of the NDBuffer in bytes.

zero

zero(self: Self)

Sets all bytes of the NDBuffer to 0.

Constraints:

The buffer must be contiguous.

simd_fill

simd_fill[simd_width: Int](self: Self, val: SIMD[type, 1])

Assigns val to all elements in chunks of size simd_width.

Parameters:

  • simd_width (Int): The simd_width of the fill.

Args:

  • val (SIMD[type, 1]): The value to store.

fill

fill(self: Self, val: SIMD[type, 1])

Assigns val to all elements in the Buffer.

The fill is performed in chunks of size N, where N is the native SIMD width of type on the system.

Args:

  • val (SIMD[type, 1]): The value to store.

aligned_stack_allocation

aligned_stack_allocation[alignment: Int]() -> Self

Constructs an NDBuffer instance backed by stack allocated memory space.

Parameters:

  • alignment (Int): Address alignment requirement for the allocation.

Returns:

Constructed NDBuffer with the allocated space.

stack_allocation

stack_allocation() -> Self

Constructs an NDBuffer instance backed by stack allocated memory space.

Returns:

Constructed NDBuffer with the allocated space.

prefetch

prefetch[params: PrefetchOptions](self: Self, *idx: Int)

Prefetches the data at the given index.

Parameters:

  • params (PrefetchOptions): The prefetch configuration.

Args:

  • idx (*Int): The N-D index of the prefetched location.

prefetch[params: PrefetchOptions](self: Self, indices: StaticIntTuple[rank])

Prefetches the data at the given index.

Parameters:

  • params (PrefetchOptions): The prefetch configuration.

Args:

  • indices (StaticIntTuple[rank]): The N-D index of the prefetched location.

DynamicRankBuffer

DynamicRankBuffer represents a buffer with unknown rank, shapes and dtype.

It is not as efficient as the statically ranked buffer, but is useful when interacting with external functions. In particular, the shape is represented as a fixed-size (i.e., _MAX_RANK) array of dimensions to simplify the ABI.

Fields:

  • data (DTypePointer[invalid]): The pointer to the buffer.
  • rank (Int): The buffer rank. Has a max value of _MAX_RANK.
  • shape (StaticIntTuple[8]): The dynamic shape of the buffer.
  • type (DType): The dynamic dtype of the buffer.

Functions:

__init__

__init__(data: DTypePointer[invalid], rank: Int, shape: StaticIntTuple[8], type: DType) -> Self

Constructs a DynamicRankBuffer.

Args:

  • data (DTypePointer[invalid]): Pointer to the underlying data.
  • rank (Int): Rank of the buffer.
  • shape (StaticIntTuple[8]): Shapes of the buffer.
  • type (DType): dtype of the buffer.

Returns:

Constructed DynamicRankBuffer.

to_buffer

to_buffer[type: DType](self: Self) -> Buffer[#pop.variant<:i1 0>, type]

Casts DynamicRankBuffer to Buffer.

Parameters:

  • type (DType): dtype of the buffer.

Returns:

Constructed Buffer.

to_ndbuffer

to_ndbuffer[rank: Int, type: DType](self: Self) -> NDBuffer[rank, create_unknown[$builtin::$int::Int][rank](), type]

Casts the buffer to NDBuffer.

Constraints:

Rank of DynamicRankBuffer must equal rank of NDBuffer.

Parameters:

  • rank (Int): Rank of the buffer.
  • type (DType): dtype of the buffer.

Returns:

Constructed NDBuffer.

to_ndbuffer[rank: Int, type: DType](self: Self, stride: StaticIntTuple[rank]) -> NDBuffer[rank, create_unknown[$builtin::$int::Int][rank](), type]

Casts the buffer to NDBuffer.

Constraints:

Rank of DynamicRankBuffer must equal rank of NDBuffer.

Parameters:

  • rank (Int): Rank of the buffer.
  • type (DType): dtype of the buffer.

Args:

  • stride (StaticIntTuple[rank]): Strides of the buffer.

Returns:

Constructed NDBuffer.

rank_dispatch

rank_dispatch[func: fn[Int]() capturing -> None](self: Self)

Dispatches the function call based on buffer rank.

Constraints:

Rank must be positive and less than or equal to 8.

Parameters:

  • func (fn[Int]() capturing -> None): Function to dispatch. The function should be parametrized on an index parameter, which is used as the rank when the function is called.

rank_dispatch[func: fn[Int]() capturing -> None](self: Self, out_chain: OutputChainPtr)

Dispatches the function call based on buffer rank.

Constraints:

Rank must be positive and less than or equal to 8.

Parameters:

  • func (fn[Int]() capturing -> None): Function to dispatch. The function should be parametrized on an index parameter, which is used as the rank when the function is called.

Args:

  • out_chain (OutputChainPtr): The output chain.
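
The idea behind rank dispatch is to turn a runtime rank into a call to a rank-specialized function. The Python sketch below models this with a table of pre-bound callables, one per supported rank. It is an analogy only: `MAX_RANK = 8` follows the constraint above, the table-based mechanism is an assumption, and the `out_chain` argument is not modeled.

```python
from typing import Callable, Dict

MAX_RANK = 8  # matches the "less than or equal to 8" constraint


def rank_dispatch(rank: int, func: Callable[[int], None]) -> None:
    """Call the specialization of `func` that matches the runtime `rank`."""
    if not 1 <= rank <= MAX_RANK:
        raise ValueError("rank must be in 1..%d" % MAX_RANK)
    # Each entry stands in for a separate compile-time instantiation of
    # func for one rank; the runtime rank selects which one runs.
    table: Dict[int, Callable[[], None]] = {
        r: (lambda r=r: func(r)) for r in range(1, MAX_RANK + 1)
    }
    table[rank]()
```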

num_elements

num_elements(self: Self) -> Int

Gets number of elements in the buffer.

Returns:

The number of elements in the buffer.

get_shape

get_shape[rank: Int](self: Self) -> StaticIntTuple[rank]

Gets a static tuple representing the buffer shape.

Parameters:

  • rank (Int): Rank of the buffer.

Returns:

A static tuple of size ‘rank’ filled with buffer shapes.

dim

dim(self: Self, idx: Int) -> Int

Gets the given dimension.

Args:

  • idx (Int): The dimension index.

Returns:

The buffer size on the given dimension.

partial_simd_load

partial_simd_load[type: DType, width: Int](storage: DTypePointer[type], lbound: Int, rbound: Int, pad_value: SIMD[type, 1]) -> SIMD[type, width]

Loads a vector with dynamic bound.

Out-of-bound lanes are filled with pad_value. A lane idx is valid if lbound <= idx < rbound, for idx from 0 to (width-1). For example:

addr 0  1  2  3
data x 42 43  x

partial_simd_load[4](addr0, 1, 3, 0) # gives [0 42 43 0]

Parameters:

  • type (DType): The underlying dtype of computation.
  • width (Int): The system simd vector size.

Args:

  • storage (DTypePointer[type]): Pointer to the address to load from.
  • lbound (Int): Lower bound of valid index within simd (inclusive).
  • rbound (Int): Upper bound of valid index within simd (non-inclusive).
  • pad_value (SIMD[type, 1]): Value to fill for out of bound indices.

Returns:

The loaded SIMD vector, with out-of-bound lanes filled with pad_value.
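
The masking rule for partial_simd_load can be expressed lane by lane. The Python sketch below models it with a list standing in for the memory at `storage`. It is a model of the documented semantics, not the Mojo implementation.

```python
from typing import List, Optional


def partial_simd_load(
    storage: List[Optional[int]],
    width: int,
    lbound: int,
    rbound: int,
    pad_value: int,
) -> List[Optional[int]]:
    """Load lanes lbound..rbound-1 from storage; pad every other lane."""
    return [
        # Only lanes inside [lbound, rbound) touch storage at all.
        storage[lane] if lbound <= lane < rbound else pad_value
        for lane in range(width)
    ]
```

With storage `[None, 42, 43, None]`, width 4, bounds 1 and 3, and a pad value of 0, the model returns `[0, 42, 43, 0]`.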

partial_simd_store

partial_simd_store[type: DType, width: Int](storage: DTypePointer[type], lbound: Int, rbound: Int, data: SIMD[type, width])

Stores a vector with dynamic bound.

Out-of-bound lanes are ignored. A lane idx is valid if lbound <= idx < rbound, for idx from 0 to (width-1). For example:

addr 0 1 2 3
data 0 0 0 0

partial_simd_store[4](addr0, 1, 3, [-1, 42, 43, -1]) # gives [0 42 43 0]

Parameters:

  • type (DType): The underlying dtype of computation.
  • width (Int): The system simd vector size.

Args:

  • storage (DTypePointer[type]): Pointer to the address to store to.
  • lbound (Int): Lower bound of valid index within simd (inclusive).
  • rbound (Int): Upper bound of valid index within simd (non-inclusive).
  • data (SIMD[type, width]): The vector value to store.
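
The masked store can be modeled the same way: only lanes inside [lbound, rbound) are written, and all other memory is left untouched. The Python sketch below uses a list in place of the memory at `storage`. It is an illustrative model, not the Mojo implementation.

```python
from typing import List


def partial_simd_store(
    storage: List[int], lbound: int, rbound: int, data: List[int]
) -> None:
    """Write data[lane] to storage[lane] only for lbound <= lane < rbound."""
    for lane, value in enumerate(data):
        if lbound <= lane < rbound:
            storage[lane] = value
```

Starting from memory `[0, 0, 0, 0]`, storing `[-1, 42, 43, -1]` with bounds 1 and 3 leaves `[0, 42, 43, 0]`.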

prod_dims

prod_dims[start_dim: Int, end_dim: Int, rank: Int, shape: DimList, type: DType](x: NDBuffer[rank, shape, type]) -> Int

Computes the product of a slice of the given buffer’s dimensions.

Parameters:

  • start_dim (Int): The index at which to begin computing the product.
  • end_dim (Int): The index at which to stop computing the product.
  • rank (Int): The rank of the NDBuffer.
  • shape (DimList): The shape of the NDBuffer.
  • type (DType): The element-type of the NDBuffer.

Args:

  • x (NDBuffer[rank, shape, type]): The NDBuffer whose dimensions will be multiplied.

Returns:

The product of the specified slice of the buffer’s dimensions.
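
The computation can be sketched in Python as a product over a slice of the shape. The page does not say whether end_dim is inclusive, so this sketch assumes a half-open range [start_dim, end_dim); treat that as an assumption rather than documented behavior.

```python
from math import prod
from typing import Sequence


def prod_dims(shape: Sequence[int], start_dim: int, end_dim: int) -> int:
    """Product of shape[start_dim:end_dim] (end_dim assumed exclusive)."""
    # math.prod returns 1 for an empty slice, the multiplicative identity.
    return prod(shape[start_dim:end_dim])
```

Under this assumption, a shape of (2, 3, 4) with start_dim 1 and end_dim 3 gives 3 * 4 = 12.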