Mojo struct
LayoutTensor
@register_passable(trivial)
struct LayoutTensor[dtype: DType, layout: Layout, rank: Int = $1.rank(), /, *, address_space: AddressSpace = 0, element_layout: Layout = __init__[::Origin[{False}],::Origin[{False}]](IntTuple(1), IntTuple(1)), layout_bitwidth: Int = Int(bitwidthof[::DType,__mlir_type.!kgen.target]()), masked: Bool = False, alignment: Int = Int(alignof[::DType,__mlir_type.!kgen.target]())]
This is a tensor type with a specified memory layout and rank. The following example demonstrates a LayoutTensor of float32 with a row-major layout of shape (5, 4).
```mojo
alias f32 = DType.float32
var tensor_5x4 = LayoutTensor[f32, Layout.row_major(5, 4)].stack_allocation()
```
Parameters
- dtype (DType): The data type of the underlying pointer.
- layout (Layout): The memory layout of the tensor.
- rank (Int): The rank of the tensor.
- address_space (AddressSpace): The address space of the underlying pointer.
- element_layout (Layout): The memory layout of each element in the tensor.
- layout_bitwidth (Int): The bitwidth of each dimension of the runtime layout.
- masked (Bool): If true, the tensor is masked and runtime layouts determine the shape.
- alignment (Int): Alignment of the data pointer.
Aliases
- index_type = _get_index_type(layout, address_space)
- uint_type = SIMD[_get_unsigned_type(layout, address_space), 1]
- element_size = element_layout.size()
- element_type = SIMD[dtype, element_layout.size()]
Fields
- ptr (UnsafePointer[SIMD[dtype, 1], address_space=address_space, alignment=alignment]): The pointer to the underlying data.
- runtime_layout (RuntimeLayout[layout, bitwidth=layout_bitwidth]): The runtime layout of the tensor.
- runtime_element_layout (RuntimeLayout[element_layout]): The runtime layout of each element.
Implemented traits
AnyType, CollectionElement, CollectionElementNew, Copyable, ExplicitlyCopyable, Movable, Stringable, UnknownDestructibility, Writable
Methods
__init__
@implicit
__init__(ptr: UnsafePointer[SIMD[dtype, 1], address_space=address_space, alignment=alignment, mut=mut, origin=origin]) -> Self
Create a LayoutTensor from an UnsafePointer. Expects the layout to be fully static.
Args:
- ptr (UnsafePointer[SIMD[dtype, 1], address_space=address_space, alignment=alignment, mut=mut, origin=origin]): The UnsafePointer pointing to the underlying data.
__init__(ptr: UnsafePointer[SIMD[dtype, 1], address_space=address_space, alignment=alignment, mut=mut, origin=origin], runtime_layout: RuntimeLayout[layout, bitwidth=bitwidth]) -> Self
Create a LayoutTensor from an UnsafePointer and a runtime layout. Expects the element layout to be fully static.
Args:
- ptr (UnsafePointer[SIMD[dtype, 1], address_space=address_space, alignment=alignment, mut=mut, origin=origin]): The UnsafePointer pointing to the underlying data.
- runtime_layout (RuntimeLayout[layout, bitwidth=bitwidth]): The runtime layout of the LayoutTensor.
__init__(ptr: UnsafePointer[SIMD[dtype, 1], address_space=address_space, alignment=alignment, mut=mut, origin=origin], runtime_layout: RuntimeLayout[layout, bitwidth=layout_bitwidth], element_runtime_layout: RuntimeLayout[element_layout]) -> Self
Create a LayoutTensor from an UnsafePointer, the runtime layout of the tensor, and the runtime layout of each element.
Args:
- ptr (UnsafePointer[SIMD[dtype, 1], address_space=address_space, alignment=alignment, mut=mut, origin=origin]): The UnsafePointer pointing to the underlying data.
- runtime_layout (RuntimeLayout[layout, bitwidth=layout_bitwidth]): The runtime layout of the LayoutTensor.
- element_runtime_layout (RuntimeLayout[element_layout]): The runtime layout of each element.
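For illustration, here is a minimal sketch that wraps a heap buffer using the first overload. The import path and the implicit mut/origin inference are assumptions based on the signatures on this page, not verified against a specific release:

```mojo
from layout import Layout, LayoutTensor
from memory import UnsafePointer

fn wrap_heap_buffer():
    # Allocate 5 * 4 scalars and view them as a 5x4 row-major tensor.
    var ptr = UnsafePointer[Float32].alloc(20)
    var tensor = LayoutTensor[DType.float32, Layout.row_major(5, 4)](ptr)
    tensor.fill(0)
    # The tensor does not own the memory; free it when done.
    ptr.free()
```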
__getitem__
__getitem__(self, *dims: Int) -> SIMD[dtype, element_layout.size()]
Get the element of the tensor at the specified index. Note that the number of indices must match the rank of the tensor.
Args:
- *dims (Int): The indices that specify which element to retrieve.
__setitem__
__setitem__(self, d0: Int, val: SIMD[dtype, element_layout.size()])
Set the element of the tensor at the specified index to the given value.
Args:
- d0 (Int): The first dimensional index.
- val (SIMD[dtype, element_layout.size()]): The value to write to the tensor.
__setitem__(self, d0: Int, d1: Int, val: SIMD[dtype, element_layout.size()])
Set the element of the tensor at the specified index to the given value.
Args:
- d0 (Int): The first dimensional index.
- d1 (Int): The second dimensional index.
- val (SIMD[dtype, element_layout.size()]): The value to write to the tensor.
__setitem__(self, d0: Int, d1: Int, d2: Int, val: SIMD[dtype, element_layout.size()])
Set the element of the tensor at the specified index to the given value.
Args:
- d0 (Int): The first dimensional index.
- d1 (Int): The second dimensional index.
- d2 (Int): The third dimensional index.
- val (SIMD[dtype, element_layout.size()]): The value to write to the tensor.
__setitem__(self, d0: Int, d1: Int, d2: Int, d3: Int, val: SIMD[dtype, element_layout.size()])
Set the element of the tensor at the specified index to the given value.
Args:
- d0 (Int): The first dimensional index.
- d1 (Int): The second dimensional index.
- d2 (Int): The third dimensional index.
- d3 (Int): The fourth dimensional index.
- val (SIMD[dtype, element_layout.size()]): The value to write to the tensor.
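A small sketch of scalar element access on the stack-allocated tensor from the introduction (imports omitted, as in that example; with the default element_layout each element is a scalar):

```mojo
alias f32 = DType.float32
var t = LayoutTensor[f32, Layout.row_major(5, 4)].stack_allocation()
t.fill(0)
# The number of indices must match the rank (2 here).
t[2, 3] = 1.5
print(t[2, 3])  # 1.5
```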
__add__
__add__(self, other: SIMD[dtype, 1]) -> Self
Add a scalar value to the LayoutTensor. The scalar value is broadcast to the entire tensor.
Args:
- other (SIMD[dtype, 1]): The scalar value.
__add__[other_layout: Layout](self, other: LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]) -> Self
Perform an addition with another LayoutTensor and return the resulting tensor. Currently this supports tensors of the same shape when the ranks match, as well as rank-2 tensors.
Parameters:
- other_layout (Layout): The layout of the other tensor.
Args:
- other (LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]): The other tensor to add.
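A sketch of both overloads on same-shape, stack-allocated tensors (imports omitted, as in the introductory example):

```mojo
alias f32 = DType.float32
var a = LayoutTensor[f32, Layout.row_major(4, 4)].stack_allocation().fill(1)
var b = LayoutTensor[f32, Layout.row_major(4, 4)].stack_allocation().fill(2)
var c = a + 10.0  # scalar value broadcast across the tensor
var d = a + b     # addition with another tensor of the same shape
```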
__sub__
__sub__(self, other: SIMD[dtype, 1]) -> Self
Subtract a scalar value from the LayoutTensor. The scalar value is broadcast to the entire tensor.
Args:
- other (SIMD[dtype, 1]): The scalar value.
__sub__[other_layout: Layout](self, other: LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]) -> Self
Perform a subtraction with another LayoutTensor and return the resulting tensor. Currently this supports tensors of the same shape when the ranks match, as well as rank-2 tensors.
Parameters:
- other_layout (Layout): The layout of the other tensor.
Args:
- other (LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]): The tensor to subtract.
__mul__
__mul__(self, other: SIMD[dtype, 1]) -> Self
Multiply the LayoutTensor by a scalar value. The scalar value is broadcast to the entire tensor.
Args:
- other (SIMD[dtype, 1]): The scalar value.
__mul__[other_layout: Layout](self, other: LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]) -> Self
Perform a multiplication with another LayoutTensor and return the resulting tensor.
Currently this supports tensors of the same shape when the ranks match, as well as rank-2 tensors.
Parameters:
- other_layout (Layout): The layout of the other tensor.
Args:
- other (LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]): The other tensor to multiply with.
Returns:
The resulting tensor after multiplication.
__truediv__
__truediv__(self, other: SIMD[dtype, 1]) -> Self
Divide the LayoutTensor by a scalar value. The scalar value is broadcast to the entire tensor.
Args:
- other (SIMD[dtype, 1]): The scalar value.
__truediv__[other_layout: Layout](self, other: LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]) -> Self
Perform a true division by another LayoutTensor and return the resulting tensor. Currently this supports tensors of the same shape when the ranks match, as well as rank-2 tensors.
Parameters:
- other_layout (Layout): The layout of the other tensor.
Args:
- other (LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]): The tensor to divide by.
__iadd__
__iadd__(self, other: SIMD[dtype, 1])
Add a scalar value to the LayoutTensor in place. The scalar value is broadcast to the entire tensor.
Args:
- other (SIMD[dtype, 1]): The scalar value.
__iadd__[other_layout: Layout](self, other: LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth])
Perform an in-place addition with another LayoutTensor. Currently this supports tensors of the same shape when the ranks match, as well as rank-2 tensors.
Parameters:
- other_layout (Layout): The layout of the other tensor.
Args:
- other (LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]): The other tensor to add.
__isub__
__isub__(self, other: SIMD[dtype, 1])
Subtract a scalar value from the LayoutTensor in place. The scalar value is broadcast to the entire tensor.
Args:
- other (SIMD[dtype, 1]): The scalar value.
__isub__[other_layout: Layout](self, other: LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth])
Subtract another LayoutTensor from this tensor in place. Currently this supports tensors of the same shape when the ranks match, as well as rank-2 tensors.
Parameters:
- other_layout (Layout): The layout of the other tensor.
Args:
- other (LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]): The tensor to subtract.
__imul__
__imul__(self, other: SIMD[dtype, 1])
Multiply the LayoutTensor by a scalar value in place. The scalar value is broadcast to the entire tensor.
Args:
- other (SIMD[dtype, 1]): The scalar value.
__imul__[other_layout: Layout](self, other: LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth])
Perform an in-place multiplication with another LayoutTensor. Currently this supports tensors of the same shape when the ranks match, as well as rank-2 tensors.
Parameters:
- other_layout (Layout): The layout of the other tensor.
Args:
- other (LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]): The other tensor to multiply with.
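The in-place operators follow the same pattern; a minimal sketch (imports omitted):

```mojo
alias f32 = DType.float32
var acc = LayoutTensor[f32, Layout.row_major(4, 4)].stack_allocation().fill(0)
var inc = LayoutTensor[f32, Layout.row_major(4, 4)].stack_allocation().fill(1)
acc += inc   # elementwise in-place addition with a same-shape tensor
acc *= 2.0   # scalar broadcast, in place
```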
copy
copy(self) -> Self
Explicitly copy this LayoutTensor.
Returns:
A copy of the value.
bitcast
bitcast[new_type: DType, /, address_space: AddressSpace = address_space, element_layout: Layout = element_layout](self) -> LayoutTensor[new_type, layout, layout.rank(), address_space=address_space, element_layout=element_layout, masked=masked]
Bitcast the underlying pointer to a new data type.
Parameters:
- new_type (DType): The new data type to cast to.
- address_space (AddressSpace): The address space of the returned LayoutTensor.
- element_layout (Layout): The element layout of the returned LayoutTensor.
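A sketch of reinterpreting the element type without copying; uint32 is chosen only because it matches float32's bit width (imports omitted):

```mojo
alias f32 = DType.float32
var t = LayoutTensor[f32, Layout.row_major(4, 4)].stack_allocation().fill(1)
# View the same storage as unsigned 32-bit integers.
var bits = t.bitcast[DType.uint32]()
print(bits[0, 0])  # the IEEE-754 bit pattern of 1.0
```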
__elementwise_unary
__elementwise_unary[func: fn(SIMD[dtype, element_layout.size()]) capturing -> SIMD[dtype, element_layout.size()], inplace: Bool = False](self) -> Self
__elementwise_binary_with_broadcast
__elementwise_binary_with_broadcast[func: fn(SIMD[dtype, element_layout.size()], SIMD[dtype, element_layout.size()]) capturing -> SIMD[dtype, element_layout.size()], other_layout: Layout, inplace: Bool = False](self, other: LayoutTensor[dtype, other_layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]) -> Self
load
load[width: Int](self, m: Int, n: Int) -> SIMD[dtype, width]
Load a value from a specified location.
Parameters:
- width (Int): The SIMD width of the returned value.
Args:
- m (Int): The m dimension of the value.
- n (Int): The n dimension of the value.
prefetch
prefetch(self, m: Int, n: Int)
Do software prefetching of a value from a specified location.
Args:
- m (Int): The m dimension of the value.
- n (Int): The n dimension of the value.
aligned_load
aligned_load[width: Int](self, m: Int, n: Int) -> SIMD[dtype, width]
Do a load with a specified alignment based on the dtype and SIMD width.
Parameters:
- width (Int): The SIMD width of the returned value.
Args:
- m (Int): The m dimension of the value.
- n (Int): The n dimension of the value.
store
store[width: Int](self, m: Int, n: Int, val: SIMD[dtype, width])
Store a value to a specified location.
Parameters:
- width (Int): The SIMD width of the stored value.
Args:
- m (Int): The m dimensional index into the tensor.
- n (Int): The n dimensional index into the tensor.
- val (SIMD[dtype, width]): The value to be stored.
aligned_store
aligned_store[width: Int](self, m: Int, n: Int, val: SIMD[dtype, width])
Do a store with a specified alignment based on the dtype and SIMD width.
Parameters:
- width (Int): The SIMD width of the stored value.
Args:
- m (Int): The m dimensional index into the tensor.
- n (Int): The n dimensional index into the tensor.
- val (SIMD[dtype, width]): The value to be stored.
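A sketch combining store and load with a SIMD width of 4 on a row-major tensor, where a row is contiguous so the access stays within one row (imports omitted):

```mojo
alias f32 = DType.float32
var t = LayoutTensor[f32, Layout.row_major(4, 8)].stack_allocation().fill(0)
# Write 4 contiguous elements starting at row 1, column 0, then read them back.
t.store[4](1, 0, SIMD[f32, 4](1.0, 2.0, 3.0, 4.0))
var v = t.load[4](1, 0)
print(v)
```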
stack_allocation
static stack_allocation[*, alignment: Int = alignment]() -> Self
Allocates stack memory for a LayoutTensor. Expects layout to be fully static.
Parameters:
- alignment (Int): Alignment of the allocation. It must be a multiple of the tensor's alignment, which is the minimum required by the architecture, instructions, etc.
shape
static shape[idx: Int]() -> Int
Returns the shape of the tensor along the given dimension index.
Parameters:
- idx (Int): The dimension index into the shape of the tensor.
stride
static stride[idx: Int]() -> Int
Returns the stride of the tensor along the given dimension index.
Parameters:
- idx (Int): The dimension index into the strides of the tensor.
dim
dim(self, idx: Int) -> Int
Returns the size of the tensor along the given dimension index.
Args:
- idx (Int): The dimension index.
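A sketch of the static shape/stride queries and the runtime dim query for a 5x4 row-major tensor (imports omitted):

```mojo
alias f32 = DType.float32
alias T5x4 = LayoutTensor[f32, Layout.row_major(5, 4)]
print(T5x4.shape[0](), T5x4.shape[1]())    # 5 4
print(T5x4.stride[0](), T5x4.stride[1]())  # 4 1
var t = T5x4.stack_allocation()
print(t.dim(0))                            # 5
```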
coalesce
coalesce(self) -> LayoutTensor[dtype, coalesce(layout, False), coalesce(layout, False).rank(), address_space=address_space, element_layout=element_layout]
Returns a LayoutTensor with a coalesced Layout.
tile
tile[*tile_sizes: Int](self, *tile_coords: Int) -> LayoutTensor[dtype, _compute_tile_layout[*::Int]().__getitem__(0), _compute_tile_layout[*::Int]().__getitem__(0).rank(), address_space=address_space, element_layout=element_layout, masked=masked if masked else _tile_is_masked[layout::layout::Layout,*::Int]()]
Tiles the layout and returns a tensor tile with the specified tile sizes at the given tile coordinates.
Example:
Given the memory layout
[1 2 3 4]
[2 3 4 5]
[5 4 3 2]
[1 1 1 1]
tile[2, 2](1, 0) returns
[5 4]
[1 1]
Parameters:
- *tile_sizes (Int): The tile sizes of the returned LayoutTensor.
Args:
- *tile_coords (Int): The tile coordinates. These refer to the coordinates of the tile within the tiled layout, as in the example above.
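A runnable version of the example above as a sketch (imports omitted; the tile is a view into the parent tensor's storage):

```mojo
alias f32 = DType.float32
var t = LayoutTensor[f32, Layout.row_major(4, 4)].stack_allocation()
# Fill with distinct values so the tile contents are visible.
for i in range(4):
    for j in range(4):
        t[i, j] = Float32(i * 4 + j)
# Tile coordinates (1, 0) with tile size 2x2 select rows 2-3, columns 0-1.
var tl = t.tile[2, 2](1, 0)
print(tl[0, 0], tl[0, 1])  # 8.0 9.0
```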
tiled_iterator
tiled_iterator[*tile_sizes: Int, *, axis: Int = 0](self, *tile_coords: Int) -> LayoutTensorIter[dtype, _compute_tile_layout[*::Int]().__getitem__(0), address_space=address_space, axis=OptionalReg(axis), layout_bitwidth=layout_bitwidth, masked=masked if masked else _tile_is_masked[layout::layout::Layout,*::Int]()]
Returns the tiled iterator of the LayoutTensor.
Parameters:
- *tile_sizes (Int): Tile sizes of each tile the iterator will iterate through.
- axis (Int): Axis of the LayoutTensor the iterator will iterate through.
Args:
- *tile_coords (Int): The tile coordinates that the iterator will point to.
split
split[count: Int, axis: Int = 0](self) -> StaticTuple[LayoutTensor[dtype, _compute_tile_layout[::Int,::Int]().__getitem__(0), _compute_tile_layout[::Int,::Int]().__getitem__(0).rank(), address_space=address_space, element_layout=element_layout], count]
Split the LayoutTensor along an axis and return a StaticTuple of LayoutTensor.
Parameters:
- count (Int): Number of portions to split into.
- axis (Int): The axis along which the split is applied.
split[axis: Int = 0, alignment: Int = 1](self, count: Int, idx: Int) -> LayoutTensor[dtype, layout.make_shape_unknown[::Int](), layout.make_shape_unknown[::Int]().rank(), address_space=address_space, element_layout=element_layout]
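A sketch of the static split overload, which divides the tensor into equal parts along an axis (imports omitted):

```mojo
alias f32 = DType.float32
var t = LayoutTensor[f32, Layout.row_major(4, 4)].stack_allocation().fill(1)
# Two 2x4 halves along axis 0; the result is indexable like a tuple.
var halves = t.split[2]()
var top = halves[0]
var bottom = halves[1]
print(top.dim(0), top.dim(1))  # 2 4
```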
distribute
distribute[threads_layout: Layout, axis: OptionalReg[Int] = OptionalReg(None), swizzle: OptionalReg[Swizzle] = OptionalReg(None), submode_axis: OptionalReg[Int] = OptionalReg(None)](self, thread_id: UInt) -> LayoutTensor[dtype, _compute_distribute_layout[layout::layout::Layout,layout::layout::Layout,stdlib::collections::optional::OptionalReg[::Int]]().__getitem__(1), _compute_distribute_layout[layout::layout::Layout,layout::layout::Layout,stdlib::collections::optional::OptionalReg[::Int]]().__getitem__(1).rank(), address_space=address_space, element_layout=element_layout, masked=masked if masked else _distribute_is_masked[layout::layout::Layout,layout::layout::Layout,stdlib::collections::optional::OptionalReg[::Int]]()]
Distribute a tiled workload to threads.
If axis is given, the tensor is distributed only along that axis. For example, with axis = 0 and 4 threads arranged as:
TH_0 TH_2
TH_1 TH_3
the tensor is distributed to the threads along axis = 0, i.e. threads 0 and 1. Threads 2 and 3 get the same tiles as threads 0 and 1, respectively. This is useful when threads load the same vectors from a row in the A matrix and some threads share the same vector.
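Purely illustrative sketch of the distribution pattern, simulating four thread IDs with a plain loop (in real code the ID would come from the GPU thread index; the 2x2 threads_layout here is an assumption):

```mojo
alias f32 = DType.float32
var t = LayoutTensor[f32, Layout.row_major(4, 4)].stack_allocation().fill(0)
for tid in range(4):
    # Each "thread" gets a 2x2 fragment of the 4x4 tensor.
    var frag = t.distribute[Layout.row_major(2, 2)](UInt(tid))
    frag.fill(Float32(tid))
```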
vectorize
vectorize[*vector_shape: Int](self) -> LayoutTensor[dtype, coalesce(_compute_tile_layout[*::Int]().__getitem__(1), True), coalesce(_compute_tile_layout[*::Int]().__getitem__(1), True).rank(), address_space=address_space, element_layout=_divide_tiles[*::Int]().__getitem__(0), masked=masked]
__compute_slice_layout
static __compute_slice_layout(d0_slice: Slice, d1_slice: Slice) -> Layout
static __compute_slice_layout(slice_0: Slice, slice_1: Slice, slice_0_axis: Int, slice_1_axis: Int) -> Layout
static __compute_slice_layout(slice_0: Slice, slice_0_axis: Int) -> Layout
slice
slice[d0_slice: Slice, d1_slice: Slice](self) -> LayoutTensor[dtype, __compute_slice_layout(d0_slice, d1_slice), __compute_slice_layout(d0_slice, d1_slice).rank(), address_space=address_space, element_layout=element_layout]
slice[d0_slice: Slice, d1_slice: Slice, slice_indices: IndexList[2], __offset_dims: Int = rank.__sub__(2)](self, offsets: IndexList[__offset_dims]) -> LayoutTensor[dtype, __compute_slice_layout(d0_slice, d1_slice, slice_indices.__getitem__[::Indexer](0), slice_indices.__getitem__[::Indexer](1)), __compute_slice_layout(d0_slice, d1_slice, slice_indices.__getitem__[::Indexer](0), slice_indices.__getitem__[::Indexer](1)).rank(), address_space=address_space, element_layout=element_layout]
slice_1d
slice_1d[d0_slice: Slice, slice_indices: IndexList[1], __offset_dims: Int = rank.__sub__(1)](self, offsets: IndexList[__offset_dims]) -> LayoutTensor[dtype, __compute_slice_layout(d0_slice, slice_indices.__getitem__[::Indexer](0)), __compute_slice_layout(d0_slice, slice_indices.__getitem__[::Indexer](0)).rank(), address_space=address_space, element_layout=element_layout]
transpose
transpose[M: Int = shape[::Int](), N: Int = shape[::Int]()](self) -> LayoutTensor[dtype, composition(layout, __init__[::Origin[{False}],::Origin[{False}]](IntTuple(N, M), IntTuple(M, 1))), composition(layout, __init__[::Origin[{False}],::Origin[{False}]](IntTuple(N, M), IntTuple(M, 1))).rank(), address_space=address_space, element_layout=element_layout]
reshape
reshape[dst_layout: Layout](self) -> LayoutTensor[dtype, dst_layout, dst_layout.rank(), address_space=address_space, element_layout=element_layout, masked=masked]
composition
composition[rhs_layout: Layout, dst_layout: Layout = composition(layout, $0)](self) -> LayoutTensor[dtype, dst_layout, dst_layout.rank(), address_space=address_space, element_layout=element_layout]
distance
distance[_uint_dtype: DType = uint32 if address_space.__eq__(3) else uint64](self, addr: UnsafePointer[SIMD[dtype, 1], address_space=address_space]) -> SIMD[_uint_dtype, 1]
Returns the distance from the input address.
distance[_layout: Layout, _uint_dtype: DType = _get_unsigned_type($0, address_space)](self, src: LayoutTensor[dtype, _layout, _layout.rank(), address_space=address_space]) -> SIMD[_uint_dtype, 1]
Returns the distance from the input address.
__get_element_idx
__get_element_idx[elem_i: Int](self) -> Int
copy_from
copy_from(self, other: LayoutTensor[dtype, layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment])
copy_from_async
copy_from_async[is_masked: Bool = False, swizzle: OptionalReg[Swizzle] = OptionalReg(None), fill: Fill = 0, eviction_policy: CacheEviction = 0](self, src: LayoutTensor[dtype, layout, rank, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment], src_idx_bound: SIMD[_get_index_type(layout, address_space), 1] = SIMD(0), base_offset: SIMD[_get_unsigned_type(layout, address_space), 1] = SIMD(0))
fill
fill(self, val: SIMD[dtype, 1]) -> Self
__str__
__str__(self) -> String
write_to
write_to[W: Writer](self, mut writer: W)
Formats a 2D tensor in 2D; otherwise prints all values in column-major coordinate order.
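Since LayoutTensor is Stringable and Writable, it can be printed directly; a minimal sketch (imports omitted):

```mojo
alias f32 = DType.float32
var t = LayoutTensor[f32, Layout.row_major(2, 3)].stack_allocation().fill(1)
print(t)  # 2D tensors are formatted in 2D
```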