struct
NDBuffer
An N-dimensional Buffer.
NDBuffer can be parameterized on rank, static dimensions, and DType. It does not own its underlying pointer.
Parameters
- type (DType): The element type of the buffer.
- rank (Int): The rank of the buffer.
- shape (DimList): The static size (if known) of the buffer.
- address_space (AddressSpace): The address space of the buffer.
Fields
- data (DTypePointer[type, address_space]): The underlying data for the buffer. The pointer is not owned by the NDBuffer.
- dynamic_shape (StaticIntTuple[rank]): The dynamic value of the shape.
- dynamic_stride (StaticIntTuple[rank]): The dynamic stride of the buffer.
- is_contiguous (Bool): True if the contents of the buffer are contiguous in memory.
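The relationship between dynamic_shape, dynamic_stride, and is_contiguous can be sketched outside Mojo. The following is a minimal Python model, not the NDBuffer API: it assumes a row-major layout with strides counted in elements, and the helper names (`contiguous_strides`, `flat_offset`, `is_contiguous`) are hypothetical.

```python
def contiguous_strides(shape):
    """Row-major strides (in elements) for a densely packed buffer (assumed layout)."""
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return tuple(strides)


def flat_offset(index, strides):
    """Element offset of an N-D index relative to the start of the data."""
    return sum(i * s for i, s in zip(index, strides))


def is_contiguous(shape, strides):
    """True when the strides match the densely packed row-major strides."""
    return tuple(strides) == contiguous_strides(shape)


shape = (2, 3, 4)
strides = contiguous_strides(shape)        # (12, 4, 1)
offset = flat_offset((1, 2, 3), strides)   # 1*12 + 2*4 + 3*1 = 23
```

Under these assumptions, a buffer built with custom strides such as (24, 8, 2) over the same shape would not count as contiguous.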
Implemented traits
AnyType, Copyable, Movable, Sized, Stringable
Methods
__init__
__init__(inout self: Self, /)
Default initializer for NDBuffer. By default the fields are all initialized to 0.
__init__(inout self: Self, /, ptr: LegacyPointer[SIMD[type, 1], address_space])
Constructs an NDBuffer with statically known rank, shapes and type.
Constraints:
The rank, shapes, and type are known.
Args:
- ptr (LegacyPointer[SIMD[type, 1], address_space]): Pointer to the data.
__init__(inout self: Self, /, ptr: DTypePointer[type, address_space])
Constructs an NDBuffer with statically known rank, shapes and type.
Constraints:
The rank, shapes, and type are known.
Args:
- ptr (DTypePointer[type, address_space]): Pointer to the data.
__init__(inout self: Self, /, ptr: LegacyPointer[scalar<#lit.struct.extract<:_stdlib::_builtin::_dtype::_DType type, "value">>, address_space], dynamic_shape: StaticIntTuple[rank])
Constructs an NDBuffer with statically known rank, but dynamic shapes and type.
Constraints:
The rank is known.
Args:
- ptr (LegacyPointer[scalar<#lit.struct.extract<:_stdlib::_builtin::_dtype::_DType type, "value">>, address_space]): Pointer to the data.
- dynamic_shape (StaticIntTuple[rank]): A static tuple of size 'rank' representing shapes.
__init__(inout self: Self, /, ptr: DTypePointer[type, address_space], dynamic_shape: StaticIntTuple[rank])
Constructs an NDBuffer with statically known rank, but dynamic shapes and type.
Constraints:
The rank is known.
Args:
- ptr (DTypePointer[type, address_space]): Pointer to the data.
- dynamic_shape (StaticIntTuple[rank]): A static tuple of size 'rank' representing shapes.
__init__(inout self: Self, /, ptr: DTypePointer[type, address_space], dynamic_shape: DimList)
Constructs an NDBuffer with statically known rank, but dynamic shapes and type.
Constraints:
The rank is known.
Args:
- ptr (DTypePointer[type, address_space]): Pointer to the data.
- dynamic_shape (DimList): A DimList of size 'rank' representing shapes.
__init__(inout self: Self, /, ptr: LegacyPointer[SIMD[type, 1], address_space], dynamic_shape: StaticIntTuple[rank], dynamic_stride: StaticIntTuple[rank])
Constructs a strided NDBuffer with statically known rank, but dynamic shapes and type.
Constraints:
The rank is known.
Args:
- ptr (LegacyPointer[SIMD[type, 1], address_space]): Pointer to the data.
- dynamic_shape (StaticIntTuple[rank]): A static tuple of size 'rank' representing shapes.
- dynamic_stride (StaticIntTuple[rank]): A static tuple of size 'rank' representing strides.
__init__(inout self: Self, /, ptr: LegacyPointer[SIMD[type, 1], address_space], dynamic_shape: DimList, dynamic_stride: StaticIntTuple[rank])
Constructs a strided NDBuffer with statically known rank, but dynamic shapes and type.
Constraints:
The rank is known.
Args:
- ptr (LegacyPointer[SIMD[type, 1], address_space]): Pointer to the data.
- dynamic_shape (DimList): A DimList of size 'rank' representing shapes.
- dynamic_stride (StaticIntTuple[rank]): A static tuple of size 'rank' representing strides.
__init__(inout self: Self, /, ptr: DTypePointer[type, address_space], dynamic_shape: StaticIntTuple[rank], dynamic_stride: StaticIntTuple[rank])
Constructs a strided NDBuffer with statically known rank, but dynamic shapes and type.
Constraints:
The rank is known.
Args:
- ptr (DTypePointer[type, address_space]): Pointer to the data.
- dynamic_shape (StaticIntTuple[rank]): A static tuple of size 'rank' representing shapes.
- dynamic_stride (StaticIntTuple[rank]): A static tuple of size 'rank' representing strides.
__init__(inout self: Self, /, ptr: DTypePointer[type, address_space], dynamic_shape: DimList, dynamic_stride: StaticIntTuple[rank])
Constructs a strided NDBuffer with statically known rank, but dynamic shapes and type.
Constraints:
The rank is known.
Args:
- ptr (DTypePointer[type, address_space]): Pointer to the data.
- dynamic_shape (DimList): A DimList of size 'rank' representing shapes.
- dynamic_stride (StaticIntTuple[rank]): A static tuple of size 'rank' representing strides.
__getitem__
__getitem__(self: Self, *idx: Int) -> SIMD[type, 1]
Gets an element from the buffer at the specified index.
Args:
- *idx (Int): Index of the element to retrieve.
Returns:
The value of the element.
__getitem__(self: Self, idx: StaticIntTuple[rank]) -> SIMD[type, 1]
Gets an element from the buffer at the specified index.
Args:
- idx (StaticIntTuple[rank]): Index of the element to retrieve.
Returns:
The value of the element.
__setitem__
__setitem__(self: Self, idx: StaticIntTuple[rank], val: SIMD[type, 1])
Stores a single value into the buffer at the specified index.
Args:
- idx (StaticIntTuple[rank]): The index into the Buffer.
- val (SIMD[type, 1]): The value to store.
__add__
__add__(self: Self, rhs: NDBuffer[type, rank, shape, address_space]) -> Self
Adds an NDBuffer.
Args:
- rhs (NDBuffer[type, rank, shape, address_space]): The RHS of the add operation.
Returns:
The addition result.
__sub__
__sub__(self: Self, rhs: Self) -> Self
Subtracts an NDBuffer.
Args:
- rhs (Self): The RHS of the sub operation.
Returns:
The subtraction result.
__sub__[rhs_shape: DimList](self: Self, rhs: NDBuffer[type, 1, rhs_shape, 0]) -> Self
Subtracts an NDBuffer.
Parameters:
- rhs_shape (DimList): Shape of RHS.
Args:
- rhs (NDBuffer[type, 1, rhs_shape, 0]): The RHS of the sub operation.
Returns:
The subtraction result.
__mul__
__mul__(self: Self, rhs: Self) -> Self
Multiplies an NDBuffer.
Args:
- rhs (Self): The RHS of the mul operation.
Returns:
The multiplication result.
__imul__
__imul__(inout self: Self, rhs: SIMD[float32, 1])
In-place multiplies a scalar.
Args:
- rhs (SIMD[float32, 1]): The RHS of the mul operation.
__imul__(inout self: Self, rhs: NDBuffer[type, rank, shape, address_space])
In-place multiplies an NDBuffer.
Args:
- rhs (NDBuffer[type, rank, shape, address_space]): The RHS of the mul operation.
__itruediv__
__itruediv__(inout self: Self, rhs: NDBuffer[type, rank, shape, address_space])
In-place divides an NDBuffer.
Args:
- rhs (NDBuffer[type, rank, shape, address_space]): The RHS of the div operation.
get_rank
get_rank(self: Self) -> Int
Returns the rank of the buffer.
Returns:
The rank of NDBuffer.
get_shape
get_shape(self: Self) -> StaticIntTuple[rank]
Returns the shapes of the buffer.
Returns:
A static tuple of size 'rank' representing shapes of the NDBuffer.
get_nd_index
get_nd_index(self: Self, idx: Int) -> StaticIntTuple[rank]
Computes the NDBuffer's ND-index based on the flat index.
Args:
- idx (Int): The flat index.
Returns:
The index positions.
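The conversion from a flat index to an N-D index can be illustrated with a short Python model. This is only a sketch: it assumes row-major ordering (last dimension varies fastest), which this page does not state, and `nd_index` is a hypothetical helper rather than part of NDBuffer.

```python
def nd_index(flat_idx, shape):
    """Unravel a flat index into an N-D index, assuming row-major order."""
    index = [0] * len(shape)
    # Peel off the fastest-varying (last) dimension first.
    for dim in range(len(shape) - 1, -1, -1):
        flat_idx, index[dim] = divmod(flat_idx, shape[dim])
    return tuple(index)


print(nd_index(23, (2, 3, 4)))  # (1, 2, 3)
```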
__len__
__len__(self: Self) -> Int
Computes the NDBuffer's number of elements.
Returns:
The total number of elements in the NDBuffer.
num_elements
num_elements(self: Self) -> Int
Computes the NDBuffer's number of elements.
Returns:
The total number of elements in the NDBuffer.
size
size(self: Self) -> Int
Computes the NDBuffer's number of elements.
Returns:
The total number of elements in the NDBuffer.
__str__
__str__(self: Self) -> String
Gets the buffer as a string.
Returns:
A compact string of the buffer.
__repr__
__repr__(self: Self) -> String
Gets the buffer as a string.
Returns:
A compact string representation of the buffer.
tile
tile[*tile_sizes: Dim](self: Self, tile_coords: StaticIntTuple[rank]) -> NDBuffer[type, rank, $0, address_space]
Returns an n-d tile "slice" of the buffer of size tile_sizes at coords.
Parameters:
- *tile_sizes (Dim): The size of the tiles.
Args:
- tile_coords (StaticIntTuple[rank]): The tile index.
Returns:
The tiled buffer at tile_coords.
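One way to picture tiling is as a view that starts at an offset inside the parent buffer and reuses the parent's strides. The Python sketch below models that idea under two assumptions not confirmed by this page: tile_coords counts in units of whole tiles, and the tile shares the parent's strides. `tile_view` is a hypothetical helper.

```python
def tile_view(strides, tile_coords, tile_sizes):
    """Return (start_offset, tile_shape, tile_strides) for a tile; no data is copied."""
    # Element coordinates of the tile's first element in the parent buffer.
    origin = tuple(c * t for c, t in zip(tile_coords, tile_sizes))
    start = sum(o * s for o, s in zip(origin, strides))
    return start, tuple(tile_sizes), tuple(strides)


# A 4x6 row-major parent has strides (6, 1); take the 2x3 tile at tile index (1, 1).
start, tile_shape, tile_strides = tile_view((6, 1), (1, 1), (2, 3))
# The tile begins at element (2, 3) of the parent, i.e. offset 2*6 + 3 = 15.
```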
load
load[*, width: Int = 1, alignment: Int = alignof[stdlib::builtin::dtype::DType,__mlir_type.!kgen.target]() if triple_is_nvidia_cuda() else 1](self: Self, *idx: Int) -> SIMD[type, $0]
Loads a simd value from the buffer at the specified index.
Constraints:
The buffer must be contiguous or width must be 1.
Parameters:
- width (Int): The simd_width of the load.
- alignment (Int): The alignment value.
Args:
- *idx (Int): The index into the NDBuffer.
Returns:
The simd value starting at the idx position and ending at idx+width.
load[*, width: Int = 1, alignment: Int = alignof[stdlib::builtin::dtype::DType,__mlir_type.!kgen.target]() if triple_is_nvidia_cuda() else 1](self: Self, idx: VariadicList[Int]) -> SIMD[type, $0]
Loads a simd value from the buffer at the specified index.
Constraints:
The buffer must be contiguous or width must be 1.
Parameters:
- width (Int): The simd_width of the load.
- alignment (Int): The alignment value.
Args:
- idx (VariadicList[Int]): The index into the NDBuffer.
Returns:
The simd value starting at the idx position and ending at idx+width.
load[*, width: Int = 1, alignment: Int = alignof[stdlib::builtin::dtype::DType,__mlir_type.!kgen.target]() if triple_is_nvidia_cuda() else 1](self: Self, idx: StaticIntTuple[rank]) -> SIMD[type, $0]
Loads a simd value from the buffer at the specified index.
Constraints:
The buffer must be contiguous or width must be 1.
Parameters:
- width (Int): The simd_width of the load.
- alignment (Int): The alignment value.
Args:
- idx (StaticIntTuple[rank]): The index into the NDBuffer.
Returns:
The simd value starting at the idx position and ending at idx+width.
load[*, width: Int = 1, alignment: Int = alignof[stdlib::builtin::dtype::DType,__mlir_type.!kgen.target]() if triple_is_nvidia_cuda() else 1](self: Self, idx: StaticTuple[Int, rank]) -> SIMD[type, $0]
Loads a simd value from the buffer at the specified index.
Constraints:
The buffer must be contiguous or width must be 1.
Parameters:
- width (Int): The simd_width of the load.
- alignment (Int): The alignment value.
Args:
- idx (StaticTuple[Int, rank]): The index into the NDBuffer.
Returns:
The simd value starting at the idx position and ending at idx+width.
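The constraint that the buffer be contiguous unless width is 1 follows from how a wide load reads memory: it takes width adjacent elements starting at the indexed position. A minimal Python model of that behavior, assuming row-major element strides and using a hypothetical `load_model` helper (not the Mojo method), might look like this:

```python
def load_model(data, strides, idx, width=1):
    """Read `width` adjacent elements starting at the N-D index `idx`."""
    start = sum(i * s for i, s in zip(idx, strides))
    return data[start:start + width]


data = list(range(12))                     # a 3x4 buffer stored row-major
row = load_model(data, (4, 1), (1, 0), 4)  # [4, 5, 6, 7]
one = load_model(data, (4, 1), (2, 3))     # [11]
```

If the innermost stride were not 1, adjacent memory locations would no longer be logically consecutive elements, which is why only width 1 is safe in that case.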
store
store[*, width: Int = 1, alignment: Int = alignof[stdlib::builtin::dtype::DType,__mlir_type.!kgen.target]() if triple_is_nvidia_cuda() else 1](self: Self, idx: StaticIntTuple[rank], val: SIMD[type, width])
Stores a simd value into the buffer at the specified index.
Constraints:
The buffer must be contiguous or width must be 1.
Parameters:
- width (Int): The width of the simd vector.
- alignment (Int): The alignment value.
Args:
- idx (StaticIntTuple[rank]): The index into the Buffer.
- val (SIMD[type, width]): The value to store.
store[*, width: Int = 1, alignment: Int = alignof[stdlib::builtin::dtype::DType,__mlir_type.!kgen.target]() if triple_is_nvidia_cuda() else 1](self: Self, idx: StaticTuple[Int, rank], val: SIMD[type, width])
Stores a simd value into the buffer at the specified index.
Constraints:
The buffer must be contiguous or width must be 1.
Parameters:
- width (Int): The width of the simd vector.
- alignment (Int): The alignment value.
Args:
- idx (StaticTuple[Int, rank]): The index into the Buffer.
- val (SIMD[type, width]): The value to store.
simd_nt_store
simd_nt_store[width: Int](self: Self, idx: StaticIntTuple[rank], val: SIMD[type, width])
Stores a simd value using non-temporal store.
Constraints:
The buffer must be contiguous. The address must be properly aligned, 64B for avx512, 32B for avx2, and 16B for avx.
Parameters:
- width (Int): The width of the simd vector.
Args:
- idx (StaticIntTuple[rank]): The index into the Buffer.
- val (SIMD[type, width]): The value to store.
simd_nt_store[width: Int](self: Self, idx: StaticTuple[Int, rank], val: SIMD[type, width])
Stores a simd value using non-temporal store.
Constraints:
The buffer must be contiguous. The address must be properly aligned, 64B for avx512, 32B for avx2, and 16B for avx.
Parameters:
- width (Int): The width of the simd vector.
Args:
- idx (StaticTuple[Int, rank]): The index into the Buffer.
- val (SIMD[type, width]): The value to store.
dim
dim[index: Int](self: Self) -> Int
Gets the buffer dimension at the given index.
Parameters:
- index (Int): The index of the dimension to get.
Returns:
The buffer size at the given dimension.
dim(self: Self, index: Int) -> Int
Gets the buffer dimension at the given index.
Args:
- index (Int): The index of the dimension to get.
Returns:
The buffer size at the given dimension.
stride
stride(self: Self, index: Int) -> Int
Gets the buffer stride at the given index.
Args:
- index (Int): The index of the dimension to get the stride for.
Returns:
The stride at the given dimension.
flatten
flatten(self: Self) -> Buffer[type, Dim(), address_space]
Constructs a flattened Buffer counterpart for this NDBuffer.
Constraints:
The buffer must be contiguous.
Returns:
Constructed Buffer object.
make_dims_unknown
make_dims_unknown(self: Self) -> NDBuffer[type, rank, create_unknown[stdlib::builtin::int::Int](), address_space]
Rebinds the NDBuffer to one with unknown shape.
Returns:
The rebound NDBuffer with unknown shape.
bytecount
bytecount(self: Self) -> Int
Returns the size of the NDBuffer in bytes.
Returns:
The size of the NDBuffer in bytes.
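As a worked example of the arithmetic, the byte size can be modeled as the element count times the size of one element. This is an assumption about how the value is derived, and `bytecount_model` is a hypothetical Python helper, not the Mojo method.

```python
from math import prod


def bytecount_model(shape, element_size):
    """Total bytes = number of elements * bytes per element (assumed)."""
    return prod(shape) * element_size


# A 2x3x4 buffer of 4-byte elements (e.g. float32): 24 elements * 4 bytes.
total = bytecount_model((2, 3, 4), 4)  # 96
```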
zero
zero(self: Self)
Sets all bytes of the NDBuffer to 0.
Constraints:
The buffer must be contiguous.
tofile
tofile(self: Self, path: Path)
Writes values to a file.
Args:
- path (Path): Path to the output file.
fill
fill(self: Self, val: SIMD[type, 1])
Assigns val to all elements in the Buffer.
The fill is performed in chunks of size N, where N is the native SIMD width of type on the system.
Args:
- val (SIMD[type, 1]): The value to store.
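The chunked fill described above can be sketched in Python. This is a model of the idea only: `simd_width` stands in for the native SIMD width N, and the scalar loop for leftover elements is an assumption, since this page does not say how a remainder is handled. `fill_model` is a hypothetical helper.

```python
def fill_model(data, val, simd_width):
    """Fill `data` in chunks of `simd_width`, then finish any remainder one by one."""
    n = len(data)
    vector_end = n - n % simd_width
    # Full-width chunks, mirroring one SIMD store per iteration.
    for start in range(0, vector_end, simd_width):
        data[start:start + simd_width] = [val] * simd_width
    # Assumed scalar tail for elements that do not fill a whole chunk.
    for i in range(vector_end, n):
        data[i] = val


buf = [0] * 10
fill_model(buf, 7, 4)  # two chunks of 4, then 2 tail elements
```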
aligned_stack_allocation
static aligned_stack_allocation[alignment: Int]() -> Self
Constructs an NDBuffer instance backed by stack allocated memory space.
Parameters:
- alignment (Int): Address alignment requirement for the allocation.
Returns:
Constructed NDBuffer with the allocated space.
stack_allocation
static stack_allocation() -> Self
Constructs an NDBuffer instance backed by stack allocated memory space.
Returns:
Constructed NDBuffer with the allocated space.
prefetch
prefetch[params: PrefetchOptions](self: Self, *idx: Int)
Prefetches the data at the given index.
Parameters:
- params (PrefetchOptions): The prefetch configuration.
Args:
- *idx (Int): The N-D index of the prefetched location.
prefetch[params: PrefetchOptions](self: Self, indices: StaticIntTuple[rank])
Prefetches the data at the given index.
Parameters:
- params (PrefetchOptions): The prefetch configuration.
Args:
- indices (StaticIntTuple[rank]): The N-D index of the prefetched location.