Mojo struct

LayoutTensor

@register_passable(trivial) struct LayoutTensor[dtype: DType, layout: Layout, /, *, mut: Bool = True, origin: Origin[mut] = SomeAnyOrigin, address_space: AddressSpace = AddressSpace(0), element_layout: Layout = Layout(IntTuple(1), IntTuple(1)), layout_bitwidth: Int = bitwidthof[DType.index](), masked: Bool = False, alignment: Int = alignof[dtype]()]

This is a tensor type that has a specified memory layout and rank. The following example demonstrates a LayoutTensor of float32 with a row-major layout of shape (5, 4).

alias f32 = DType.float32
var tensor_5x4 = LayoutTensor[f32, Layout.row_major(5,4)].stack_allocation()

Parameters

  • dtype (DType): The data type of the underlying pointer.
  • layout (Layout): The memory layout of the Tensor.
  • mut (Bool): The mutability of the underlying pointer.
  • origin (Origin[mut]): The origin of the underlying pointer.
  • address_space (AddressSpace): The address space of the underlying pointer.
  • element_layout (Layout): The memory layout of each element in the Tensor.
  • layout_bitwidth (Int): The bitwidth of each dimension of runtime layout.
  • masked (Bool): If true, the tensor is masked and runtime layouts determine the shape.
  • alignment (Int): Alignment of the data pointer.

Aliases

  • rank = layout.rank()
  • index_type = _get_index_type(layout, address_space)
  • uint_type = SIMD[_get_unsigned_type(layout, address_space), 1]
  • element_size = element_layout.size()
  • element_type = SIMD[dtype, element_layout.size()]

Fields

  • ptr (UnsafePointer[SIMD[dtype, 1], address_space=address_space, alignment=alignment, mut=mut, origin=origin])
  • runtime_layout (RuntimeLayout[layout, bitwidth=layout_bitwidth])
  • runtime_element_layout (RuntimeLayout[element_layout])

Implemented traits

AnyType, CollectionElement, CollectionElementNew, Copyable, ExplicitlyCopyable, Movable, Stringable, UnknownDestructibility, Writable

Methods

__init__

@implicit __init__(ptr: UnsafePointer[SIMD[dtype, 1], address_space=address_space, alignment=alignment, mut=mut, origin=origin]) -> Self

Create a LayoutTensor from an UnsafePointer. Expects the layout to be fully static.

Args:

  • ptr (UnsafePointer[SIMD[dtype, 1], address_space=address_space, alignment=alignment, mut=mut, origin=origin]): The UnsafePointer pointing to the underlying data.
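For example, a minimal sketch of this constructor (the buffer size and shape here are illustrative):

from layout import Layout, LayoutTensor
from memory import UnsafePointer

alias dtype = DType.float32

# Allocate space for a 5x4 tensor, wrap it, then release it when done.
var ptr = UnsafePointer[Scalar[dtype]].alloc(20)
var tensor = LayoutTensor[dtype, Layout.row_major(5, 4)](ptr)
# ... use tensor ...
ptr.free()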

__init__(ptr: UnsafePointer[SIMD[dtype, 1], address_space=address_space, alignment=alignment, mut=mut, origin=origin], runtime_layout: RuntimeLayout[layout, bitwidth=bitwidth]) -> Self

Create a LayoutTensor from an UnsafePointer and a runtime layout. Expects the element layout to be fully static.

Args:

  • ptr (UnsafePointer[SIMD[dtype, 1], address_space=address_space, alignment=alignment, mut=mut, origin=origin]): The UnsafePointer pointing to the underlying data.
  • runtime_layout (RuntimeLayout[layout, bitwidth=bitwidth]): The runtime layout of the LayoutTensor.

__init__(ptr: UnsafePointer[SIMD[dtype, 1], address_space=address_space, alignment=alignment, mut=mut, origin=origin], runtime_layout: RuntimeLayout[layout, bitwidth=layout_bitwidth], element_runtime_layout: RuntimeLayout[element_layout]) -> Self

Create a LayoutTensor from an UnsafePointer, a runtime layout for the tensor, and a runtime layout for each element.

Args:

  • ptr (UnsafePointer[SIMD[dtype, 1], address_space=address_space, alignment=alignment, mut=mut, origin=origin]): The UnsafePointer pointing to the underlying data.
  • runtime_layout (RuntimeLayout[layout, bitwidth=layout_bitwidth]): The runtime layout of the LayoutTensor.
  • element_runtime_layout (RuntimeLayout[element_layout]): The runtime layout of each element.

@implicit __init__(device_buffer: DeviceBuffer[dtype, address_space, mut, origin]) -> Self

Create a LayoutTensor from a DeviceBuffer. The layout must have statically known dimensions.

from gpu.host import DeviceContext, DeviceBuffer
from layout import Layout, LayoutTensor

alias dtype = DType.float32

var ctx = DeviceContext()
var dev_buf = ctx.enqueue_create_buffer[dtype](8)

alias layout = Layout.row_major(4, 4)
var tensor = LayoutTensor[dtype, layout](dev_buf)

Args:

  • device_buffer (DeviceBuffer[dtype, address_space, mut, origin]): Contains the underlying data to point to.

__init__(device_buffer: DeviceBuffer[dtype, address_space, mut, origin], runtime_layout: RuntimeLayout[layout, bitwidth=bitwidth]) -> Self

Create a LayoutTensor from a DeviceBuffer. The layout must have statically known dimensions.

Args:

  • device_buffer (DeviceBuffer[dtype, address_space, mut, origin]): The DeviceBuffer containing the underlying data.
  • runtime_layout (RuntimeLayout[layout, bitwidth=bitwidth]): The runtime layout of the LayoutTensor.

__init__(device_buffer: DeviceBuffer[dtype, address_space, mut, origin], runtime_layout: RuntimeLayout[layout, bitwidth=layout_bitwidth], element_runtime_layout: RuntimeLayout[element_layout]) -> Self

Create a LayoutTensor from a DeviceBuffer, a runtime layout of the Tensor, and the runtime layout of each element.

Args:

  • device_buffer (DeviceBuffer[dtype, address_space, mut, origin]): The DeviceBuffer containing the underlying data.
  • runtime_layout (RuntimeLayout[layout, bitwidth=layout_bitwidth]): The runtime layout of the LayoutTensor.
  • element_runtime_layout (RuntimeLayout[element_layout]): The runtime layout of each element.

__getitem__

__getitem__(self, *dims: Int) -> SIMD[dtype, element_layout.size()]

Get the element of the tensor at the specified index. Note that the number of indices must match the rank of the tensor.

Args:

  • *dims (Int): The indices that specify which element to retrieve.

__setitem__

__setitem__(self, d0: Int, val: SIMD[dtype, element_layout.size()])

Set the element of the tensor at the specified index to the given value.

Args:

  • d0 (Int): The first dimensional index.
  • val (SIMD[dtype, element_layout.size()]): The value to write to the tensor.

__setitem__(self, d0: Int, d1: Int, val: SIMD[dtype, element_layout.size()])

Set the element of the tensor at the specified index to the given value.

Args:

  • d0 (Int): The first dimensional index.
  • d1 (Int): The second dimensional index.
  • val (SIMD[dtype, element_layout.size()]): The value to write to the tensor.

__setitem__(self, d0: Int, d1: Int, d2: Int, val: SIMD[dtype, element_layout.size()])

Set the element of the tensor at the specified index to the given value.

Args:

  • d0 (Int): The first dimensional index.
  • d1 (Int): The second dimensional index.
  • d2 (Int): The third dimensional index.
  • val (SIMD[dtype, element_layout.size()]): The value to write to the tensor.

__setitem__(self, d0: Int, d1: Int, d2: Int, d3: Int, val: SIMD[dtype, element_layout.size()])

Set the element of the tensor at the specified index to the given value.

Args:

  • d0 (Int): The first dimensional index.
  • d1 (Int): The second dimensional index.
  • d2 (Int): The third dimensional index.
  • d3 (Int): The fourth dimensional index.
  • val (SIMD[dtype, element_layout.size()]): The value to write to the tensor.
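Taken together with __getitem__, a minimal sketch of element access (the shape and values are illustrative):

from layout import Layout, LayoutTensor

alias f32 = DType.float32

var t = LayoutTensor[f32, Layout.row_major(5, 4)].stack_allocation()
t[2, 3] = 1.25        # __setitem__ with two indices
var v = t[2, 3]       # __getitem__ returns SIMD[f32, element_size]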

__add__

__add__(self, other: SIMD[dtype, 1]) -> LayoutTensor[dtype, layout, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment]

Add a scalar value to the LayoutTensor. The scalar value is broadcast to the entire tensor.

Args:

  • other (SIMD[dtype, 1]): The scalar value.

__add__[other_layout: Layout](self, other: LayoutTensor[dtype, other_layout, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]) -> LayoutTensor[dtype, layout, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment]

Add another LayoutTensor and return the resulting tensor. Currently supports only tensors of the same shape when the ranks match, as well as rank-2 tensors.

Parameters:

  • other_layout (Layout): The layout of the other tensor.

Args:

  • other (LayoutTensor[dtype, other_layout, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]): The tensor to add.
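A minimal sketch of both overloads (the shapes and values are illustrative):

from layout import Layout, LayoutTensor

alias f32 = DType.float32

var a = LayoutTensor[f32, Layout.row_major(2, 2)].stack_allocation()
var b = LayoutTensor[f32, Layout.row_major(2, 2)].stack_allocation()
_ = a.fill(1.0)
_ = b.fill(2.0)
var c = a + 10.0      # scalar broadcast to every element
var d = a + b         # elementwise addition of same-shape tensors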

__sub__

__sub__(self, other: SIMD[dtype, 1]) -> LayoutTensor[dtype, layout, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment]

Subtract a scalar value from the LayoutTensor. The scalar value is broadcast to the entire tensor.

Args:

  • other (SIMD[dtype, 1]): The scalar value.

__sub__[other_layout: Layout](self, other: LayoutTensor[dtype, other_layout, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]) -> LayoutTensor[dtype, layout, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment]

Subtract another LayoutTensor and return the resulting tensor. Currently supports only tensors of the same shape when the ranks match, as well as rank-2 tensors.

Parameters:

  • other_layout (Layout): The layout of the other tensor.

Args:

  • other (LayoutTensor[dtype, other_layout, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]): The tensor to subtract.

__mul__

__mul__(self, other: SIMD[dtype, 1]) -> LayoutTensor[dtype, layout, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment]

Multiply the LayoutTensor by a scalar value. The scalar value is broadcast to the entire tensor.

Args:

  • other (SIMD[dtype, 1]): The scalar value.

__mul__[other_layout: Layout](self, other: LayoutTensor[dtype, other_layout, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]) -> LayoutTensor[dtype, layout, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment]

Perform a multiplication with another LayoutTensor and return the resulting tensor.

Currently, only tensors of the same shape are supported if the ranks are the same. Additionally, tensors of rank-2 are supported.

Parameters:

  • other_layout (Layout): The layout of the other tensor.

Args:

  • other (LayoutTensor[dtype, other_layout, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]): The tensor to multiply by.

Returns:

The resulting tensor after multiplication.

__truediv__

__truediv__(self, other: SIMD[dtype, 1]) -> LayoutTensor[dtype, layout, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment]

Divide the LayoutTensor by a scalar value. The scalar value is broadcast to the entire tensor.

Args:

  • other (SIMD[dtype, 1]): The scalar value.

__truediv__[other_layout: Layout](self, other: LayoutTensor[dtype, other_layout, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]) -> LayoutTensor[dtype, layout, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment]

Divide by another LayoutTensor elementwise and return the resulting tensor. Currently supports only tensors of the same shape when the ranks match, as well as rank-2 tensors.

Parameters:

  • other_layout (Layout): The layout of the other tensor.

Args:

  • other (LayoutTensor[dtype, other_layout, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]): The tensor to divide by.

__iadd__

__iadd__(self, other: SIMD[dtype, 1])

Add a scalar value to the LayoutTensor in place. The scalar value is broadcast to the entire tensor.

Args:

  • other (SIMD[dtype, 1]): The scalar value.

__iadd__[other_layout: Layout](self, other: LayoutTensor[dtype, other_layout, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth])

Add another LayoutTensor in place. Currently supports only tensors of the same shape when the ranks match, as well as rank-2 tensors.

Parameters:

  • other_layout (Layout): The layout of the other tensor.

Args:

  • other (LayoutTensor[dtype, other_layout, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]): The tensor to add.
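A minimal in-place sketch (the shapes and values are illustrative):

from layout import Layout, LayoutTensor

alias f32 = DType.float32

var acc = LayoutTensor[f32, Layout.row_major(2, 2)].stack_allocation()
var ones = LayoutTensor[f32, Layout.row_major(2, 2)].stack_allocation()
_ = acc.fill(0.0)
_ = ones.fill(1.0)
acc += 1.0            # scalar broadcast, in place
acc += ones           # add a same-shape tensor in place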

__isub__

__isub__(self, other: SIMD[dtype, 1])

Subtract a scalar value from the LayoutTensor in place. The scalar value is broadcast to the entire tensor.

Args:

  • other (SIMD[dtype, 1]): The scalar value.

__isub__[other_layout: Layout](self, other: LayoutTensor[dtype, other_layout, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth])

Subtract another LayoutTensor in place. Currently supports only tensors of the same shape when the ranks match, as well as rank-2 tensors.

Parameters:

  • other_layout (Layout): The layout of the other tensor.

Args:

  • other (LayoutTensor[dtype, other_layout, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]): The tensor to subtract.

__imul__

__imul__(self, other: SIMD[dtype, 1])

Multiply the LayoutTensor by a scalar value in place. The scalar value is broadcast to the entire tensor.

Args:

  • other (SIMD[dtype, 1]): The scalar value.

__imul__[other_layout: Layout](self, other: LayoutTensor[dtype, other_layout, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth])

Multiply by another LayoutTensor in place. Currently supports only tensors of the same shape when the ranks match, as well as rank-2 tensors.

Parameters:

  • other_layout (Layout): The layout of the other tensor.

Args:

  • other (LayoutTensor[dtype, other_layout, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]): The tensor to multiply by.

__itruediv__

__itruediv__(self, other: SIMD[dtype, 1])

Divide the LayoutTensor by a scalar value in place. The scalar value is broadcast to the entire tensor.

Args:

  • other (SIMD[dtype, 1]): The scalar value.

__itruediv__[other_layout: Layout](self, other: LayoutTensor[dtype, other_layout, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth])

Divide by another LayoutTensor in place. Currently supports only tensors of the same shape when the ranks match, as well as rank-2 tensors.

Parameters:

  • other_layout (Layout): The layout of the other tensor.

Args:

  • other (LayoutTensor[dtype, other_layout, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth]): The tensor to divide by.

copy

copy(self) -> Self

Explicitly copy this LayoutTensor.

Returns:

A copy of the value.

bitcast

bitcast[new_type: DType, /, address_space: AddressSpace = address_space, element_layout: Layout = element_layout](self) -> LayoutTensor[new_type, layout, mut=mut, origin=origin, address_space=address_space, element_layout=element_layout, masked=masked]

Bitcast the underlying pointer to a new data type.

Parameters:

  • new_type (DType): The data type to cast to.
  • address_space (AddressSpace): The address space of the returned LayoutTensor.
  • element_layout (Layout): The element layout of the returned LayoutTensor.

origin_cast

origin_cast[mut: Bool = mut, origin: Origin[$0] = (mutcast origin._mlir_origin)](self) -> LayoutTensor[dtype, layout, mut=mut, origin=origin, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment]

Changes the origin or mutability of a pointer.

Parameters:

  • mut (Bool): Whether the origin is mutable.
  • origin (Origin[$0]): Origin of the destination pointer.

Returns:

A new LayoutTensor with the same type and the same address as the original, but with the specified mutability and origin.

load

load[width: Int](self, m: Int, n: Int) -> SIMD[dtype, width]

Load a value from a specified location.

Parameters:

  • width (Int): The simd width of the returned value.

Args:

  • m (Int): The m dimension of the value.
  • n (Int): The n dimension of the value.

prefetch

prefetch(self, m: Int, n: Int)

Do software prefetching of a value from a specified location.

Args:

  • m (Int): The m dimension of the value.
  • n (Int): The n dimension of the value.

aligned_load

aligned_load[width: Int](self, m: Int, n: Int) -> SIMD[dtype, width]

Do a load with a specified alignment based on the dtype and simd width.

Parameters:

  • width (Int): The simd width of the returned value.

Args:

  • m (Int): The m dimension of the value.
  • n (Int): The n dimension of the value.

store

store[width: Int](self, m: Int, n: Int, val: SIMD[dtype, width])

Store a value to a specified location.

Parameters:

  • width (Int): The simd width of the stored value.

Args:

  • m (Int): The m dimensional index to the tensor.
  • n (Int): The n dimensional index to the tensor.
  • val (SIMD[dtype, width]): The value to be stored.
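A minimal sketch of a vectorized load followed by a store (width 4 matches the row length chosen here):

from layout import Layout, LayoutTensor

alias f32 = DType.float32

var t = LayoutTensor[f32, Layout.row_major(4, 4)].stack_allocation()
_ = t.fill(1.0)
var row = t.load[4](1, 0)      # load 4 contiguous values starting at (1, 0)
t.store[4](2, 0, row * 2.0)    # store the scaled vector at (2, 0)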

aligned_store

aligned_store[width: Int](self, m: Int, n: Int, val: SIMD[dtype, width])

Do a store with a specified alignment based on the dtype and simd width.

Parameters:

  • width (Int): The simd width of the stored value.

Args:

  • m (Int): The m dimensional index to the tensor.
  • n (Int): The n dimensional index to the tensor.
  • val (SIMD[dtype, width]): The value to be stored.

stack_allocation

static stack_allocation[*, alignment: Int = alignment]() -> LayoutTensor[dtype, layout, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment]

Allocates stack memory for a LayoutTensor. Expects the layout to be fully static.

Parameters:

  • alignment (Int): Alignment of the allocation. It must be a multiple of the tensor's alignment, which is the minimum required by the architecture, instruction set, etc.
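For example, a sketch that overrides the default alignment (64 is an illustrative value; it must be a multiple of the tensor's minimum alignment):

from layout import Layout, LayoutTensor

alias f32 = DType.float32

var t = LayoutTensor[f32, Layout.row_major(4, 4)].stack_allocation[alignment=64]()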

shape

static shape[idx: Int]() -> Int

Returns the shape of the tensor at the given index.

Parameters:

  • idx (Int): The index to the shape of the tensor.

stride

static stride[idx: Int]() -> Int

Returns the stride of the tensor at the given index.

Parameters:

  • idx (Int): The index to the stride of the tensor.
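Since shape and stride are static methods, they can be evaluated without an instance. A minimal sketch for a row-major 5x4 layout:

from layout import Layout, LayoutTensor

alias T = LayoutTensor[DType.float32, Layout.row_major(5, 4)]

# T.shape[0]() == 5, T.shape[1]() == 4
# T.stride[0]() == 4, T.stride[1]() == 1   (row-major strides)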

dim

dim(self, idx: Int) -> Int

Returns the dimension of the tensor at the given index.

Args:

  • idx (Int): The index to the dimension of the tensor.

coalesce

coalesce(self) -> LayoutTensor[dtype, coalesce(layout, False), mut=mut, origin=origin, address_space=address_space, element_layout=element_layout]

Returns a LayoutTensor with a coalesced Layout.

tile

tile[*tile_sizes: Int](self, *tile_coords: Int) -> LayoutTensor[dtype, _compute_tile_layout[*::Int]().__getitem__(0), mut=mut, origin=origin, address_space=address_space, element_layout=element_layout, masked=masked if masked else _tile_is_masked[layout::layout::Layout,*::Int]()]

Tiles the layout and returns a tensor tile with the specified tile_sizes at the given tile coordinates.

Example:

Memory Layout of
[1 2 3 4]
[2 3 4 5]
[5 4 3 2]
[1 1 1 1]

tile[2, 2](1, 0) will give you
[5 4]
[1 1]

Parameters:

  • *tile_sizes (Int): The tile sizes of the returned LayoutTensor.

Args:

  • *tile_coords (Int): The tile coordinates. These refer to the coordinates of the tile within the tiled layout; see the example above and the sketch below.
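The example above as code (a minimal sketch; the values come from the matrix shown):

from layout import Layout, LayoutTensor

alias f32 = DType.float32

var t = LayoutTensor[f32, Layout.row_major(4, 4)].stack_allocation()
# ... fill t with the 4x4 values shown above ...
var t10 = t.tile[2, 2](1, 0)   # the 2x2 tile at tile coordinate (1, 0)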

tiled_iterator

tiled_iterator[*tile_sizes: Int, *, axis: Int = 0](self, *tile_coords: Int) -> LayoutTensorIter[dtype, _compute_tile_layout[*::Int]().__getitem__(0), mut=mut, origin=origin, address_space=address_space, axis=OptionalReg[Int]({:_stdlib::_builtin::_int::_Int axis, 0}), layout_bitwidth=layout_bitwidth, masked=masked if masked else _tile_is_masked[layout::layout::Layout,*::Int]()]

Returns the tiled iterator of the LayoutTensor.

Parameters:

  • *tile_sizes (Int): Tile sizes of each tile the iterator will iterate through.
  • axis (Int): Axis of the LayoutTensor the iterator will iterate through.

Args:

  • *tile_coords (Int): The tile coordinate that the iterator will point to.

split

split[count: Int, axis: Int = 0](self) -> StaticTuple[LayoutTensor[dtype, _compute_tile_layout[::Int,::Int]().__getitem__(0), mut=mut, origin=origin, address_space=address_space, element_layout=element_layout, alignment=alignment], count]

Split the LayoutTensor along an axis and return a StaticTuple of LayoutTensor.

Parameters:

  • count (Int): The number of portions to split into.
  • axis (Int): The axis along which to split.

split[axis: Int = 0, alignment: Int = 1](self, count: Int, idx: Int) -> LayoutTensor[dtype, layout.make_shape_unknown[::Int](), mut=mut, origin=origin, address_space=address_space, element_layout=element_layout]

distribute

distribute[threads_layout: Layout, axis: OptionalReg[Int] = OptionalReg[Int]({:i1 0, 1}), swizzle: OptionalReg[Swizzle] = OptionalReg[Swizzle]({:i1 0, 1}), submode_axis: OptionalReg[Int] = OptionalReg[Int]({:i1 0, 1})](self, thread_id: UInt) -> LayoutTensor[dtype, _compute_distribute_layout[layout::layout::Layout,layout::layout::Layout,stdlib::collections::optional::OptionalReg[::Int]]().__getitem__(1), mut=mut, origin=origin, address_space=address_space, element_layout=element_layout, masked=masked if masked else _distribute_is_masked[layout::layout::Layout,layout::layout::Layout,stdlib::collections::optional::OptionalReg[::Int]]()]

Distribute tiled workload to threads.

If the axis is given, for example, using axis = 0 for 4 threads:

TH_0 TH_2
TH_1 TH_3

This means the tensor is distributed only to threads along axis = 0, i.e., threads 0 and 1. Threads 2 and 3 get the same tiles as threads 0 and 1, respectively. This is useful when threads load the same vectors from a row in the A matrix and some threads share the same vector.
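A hypothetical kernel-side sketch (the 2x2 thread layout, the tensor t, and the use of thread_idx are illustrative, not part of this API's contract):

from gpu import thread_idx
from layout import Layout, LayoutTensor

alias f32 = DType.float32

var t = LayoutTensor[f32, Layout.row_major(4, 4)].stack_allocation()
# Inside a kernel launched with 4 threads: each thread gets its own fragment.
var fragment = t.distribute[Layout.row_major(2, 2)](thread_idx.x)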

vectorize

vectorize[*vector_shape: Int](self) -> LayoutTensor[dtype, coalesce(_compute_tile_layout[*::Int]().__getitem__(1), True), mut=mut, origin=origin, address_space=address_space, element_layout=_divide_tiles[*::Int]().__getitem__(0), masked=masked]
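A minimal sketch, assuming vectorize[1, 4] groups each run of 4 contiguous columns into a single SIMD element:

from layout import Layout, LayoutTensor

alias f32 = DType.float32

var t = LayoutTensor[f32, Layout.row_major(4, 8)].stack_allocation()
var v = t.vectorize[1, 4]()    # v has shape (4, 2); each element is SIMD[f32, 4]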

slice

slice[d0_slice: Slice, d1_slice: Slice](self) -> LayoutTensor[dtype, _compute_slice_layout(d0_slice, d1_slice), mut=mut, origin=origin, address_space=address_space, element_layout=element_layout]

slice[d0_slice: Slice, d1_slice: Slice, slice_indices: Index[2], __offset_dims: Int = (layout.rank() + -2)](self, offsets: Index[__offset_dims]) -> LayoutTensor[dtype, _compute_slice_layout(d0_slice, d1_slice, slice_indices.__getitem__[::Indexer](0), slice_indices.__getitem__[::Indexer](1)), mut=mut, origin=origin, address_space=address_space, element_layout=element_layout]
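A minimal sketch of the first overload (the slice bounds are illustrative):

from layout import Layout, LayoutTensor

alias f32 = DType.float32

var t = LayoutTensor[f32, Layout.row_major(4, 4)].stack_allocation()
var s = t.slice[Slice(0, 2), Slice(1, 3)]()   # 2x2 view of rows 0-1, columns 1-2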

slice_1d

slice_1d[d0_slice: Slice, slice_indices: Index[1], __offset_dims: Int = (layout.rank() + -1)](self, offsets: Index[__offset_dims]) -> LayoutTensor[dtype, _compute_slice_layout(d0_slice, slice_indices.__getitem__[::Indexer](0)), mut=mut, origin=origin, address_space=address_space, element_layout=element_layout]

transpose

transpose[M: Int = shape[::Int](), N: Int = shape[::Int]()](self) -> LayoutTensor[dtype, composition(layout, __init__[::Origin[::Bool(IntTuple(N, M), IntTuple(M, 1))), mut=mut, origin=origin, address_space=address_space, element_layout=element_layout]
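A minimal sketch for a rank-2 tensor:

from layout import Layout, LayoutTensor

alias f32 = DType.float32

var t = LayoutTensor[f32, Layout.row_major(2, 3)].stack_allocation()
var tt = t.transpose()   # a 3x2 view with the two dimensions swapped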

reshape

reshape[dst_layout: Layout](self) -> LayoutTensor[dtype, dst_layout, mut=mut, origin=origin, address_space=address_space, element_layout=element_layout, masked=masked]
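A minimal sketch (the destination layout must describe the same total number of elements):

from layout import Layout, LayoutTensor

alias f32 = DType.float32

var t = LayoutTensor[f32, Layout.row_major(4, 4)].stack_allocation()
var r = t.reshape[Layout.row_major(2, 8)]()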

composition

composition[rhs_layout: Layout, dst_layout: Layout = composition(layout, $0)](self) -> LayoutTensor[dtype, dst_layout, mut=mut, origin=origin, address_space=address_space, element_layout=element_layout]

distance

distance[_uint_dtype: DType = uint32 if (address_space == AddressSpace(3)) else uint64](self, addr: UnsafePointer[SIMD[dtype, 1], address_space=address_space]) -> SIMD[_uint_dtype, 1]

Returns the distance from the input address.

distance[_layout: Layout, _uint_dtype: DType = _get_unsigned_type($0, address_space)](self, src: LayoutTensor[dtype, _layout, address_space=address_space]) -> SIMD[_uint_dtype, 1]

Returns the distance from the input address.

copy_from

copy_from(self, other: LayoutTensor[dtype, layout, mut=mut, origin=origin, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment])

copy_from_async

copy_from_async[is_masked: Bool = False, swizzle: OptionalReg[Swizzle] = OptionalReg[Swizzle]({:i1 0, 1}), fill: Fill = Fill(0), eviction_policy: CacheEviction = CacheEviction(0)](self, src: LayoutTensor[dtype, layout, mut=mut, origin=origin, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment], src_idx_bound: SIMD[_get_index_type(layout, address_space), 1] = __init__[__mlir_type.!kgen.int_literal](0), base_offset: SIMD[_get_unsigned_type(layout, address_space), 1] = __init__[__mlir_type.!kgen.int_literal](0))

fill

fill(self: LayoutTensor[dtype, layout, origin=origin, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment], val: SIMD[dtype, 1]) -> LayoutTensor[dtype, layout, origin=origin, address_space=address_space, element_layout=element_layout, layout_bitwidth=layout_bitwidth, masked=masked, alignment=alignment]
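For example (the fill value is illustrative):

from layout import Layout, LayoutTensor

alias f32 = DType.float32

var t = LayoutTensor[f32, Layout.row_major(4, 4)].stack_allocation()
_ = t.fill(0.0)   # sets every element to 0.0 and returns the tensor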

__str__

__str__(self) -> String

write_to

write_to[W: Writer](self, mut writer: W)

Formats a 2D tensor in 2D; otherwise, prints all values in column-major coordinate order.