Mojo struct

ChunkedMask

struct ChunkedMask[local_window_size: Int]

Mask implementing Chunked attention.

This groups the mask into chunks of size local_window_size, so that a query attends only to the keys in its own chunk. Consider the following case:

Q_len = 7
K_len = 10
local_window_size = 4

The mask will be applied as follows:

    K >   0 1 2 3 4 5 6 7 8 9
    Q v x--------------------x
      0 | 1 1 1 1 0 0 0 0 0 0
      1 | 0 0 0 0 1 1 1 1 0 0
      2 | 0 0 0 0 1 1 1 1 0 0
      3 | 0 0 0 0 1 1 1 1 0 0
      4 | 0 0 0 0 1 1 1 1 0 0
      5 | 0 0 0 0 0 0 0 0 1 1
      6 | 0 0 0 0 0 0 0 0 1 1
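The grid above can be reproduced with a small reference model. This is a Python sketch, not the Mojo implementation. It assumes the last query row lines up with the last key column, so query row q sits at position q + (K_len - Q_len); that offset is inferred from the example and is not stated elsewhere on this page. The function name chunked_mask is hypothetical.

```python
def chunked_mask(q_len: int, k_len: int, local_window_size: int) -> list[list[int]]:
    """Return a q_len x k_len grid: 1 where attention is allowed, 0 where masked."""
    # Assumption: queries are aligned to the end of the key sequence.
    start_pos = k_len - q_len
    grid = []
    for q in range(q_len):
        # Chunk index of this query's position in the key coordinate system.
        q_chunk = (q + start_pos) // local_window_size
        # A key is visible only if it falls in the same chunk as the query.
        grid.append(
            [1 if k // local_window_size == q_chunk else 0 for k in range(k_len)]
        )
    return grid


grid = chunked_mask(q_len=7, k_len=10, local_window_size=4)
for q, row in enumerate(grid):
    print(q, "|", " ".join(map(str, row)))
```

Note that the grid is not causal: row 1 sits at position 4 and still sees keys 5 through 7, because only chunk membership is tested.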

Implemented traits

AnyType, Copyable, DevicePassable, ImplicitlyCopyable, ImplicitlyDestructible, MHAMask, Movable, RegisterPassable, TrivialRegisterPassable

`comptime` members

`apply_log2e_after_mask`

comptime apply_log2e_after_mask = False

`check_mask_during_decoding`

comptime check_mask_during_decoding = True

`device_type`

comptime device_type = ChunkedMask[local_window_size]

`mask_out_of_bound`

comptime mask_out_of_bound = True

`mask_safe_out_of_bounds`

comptime mask_safe_out_of_bounds = True

Methods

`get_type_name`

static get_type_name() -> String

Returns:

String

`name`

static name() -> String

Returns:

String

`mask`

mask[dtype: DType, width: Int, //, *, element_type: DType = DType.uint32](self, coord: IndexList[4, element_type=element_type], score_vec: SIMD[dtype, width]) -> SIMD[dtype, width]

Returns:

SIMD[dtype, width]
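The page gives only the signature of mask, so the following Python sketch is an illustration under stated assumptions rather than a description of the actual behavior. It assumes that the score vector covers consecutive key positions starting at k_start, that q_pos is already expressed in the same coordinate system as the keys, and that masked lanes are replaced with negative infinity (a common convention before softmax). The name mask_scores and its parameters are hypothetical.

```python
import math


def mask_scores(q_pos: int, k_start: int, scores: list[float],
                local_window_size: int) -> list[float]:
    """Keep scores whose key lies in the query's chunk; set the rest to -inf."""
    q_chunk = q_pos // local_window_size
    return [
        # Lane i of the vector is assumed to hold the score for key k_start + i.
        s if (k_start + lane) // local_window_size == q_chunk else -math.inf
        for lane, s in enumerate(scores)
    ]


# Query at position 4 (chunk 1) against keys 2, 3, 4, 5 (chunks 0, 0, 1, 1).
out = mask_scores(q_pos=4, k_start=2, scores=[0.5, 1.0, 1.5, 2.0],
                  local_window_size=4)
print(out)
```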

`status`

status[*, element_type: DType = DType.uint32](self, tile_offset: IndexList[2, element_type=element_type], tile_size: IndexList[2, element_type=element_type]) -> TileMaskStatus

Returns:

TileMaskStatus
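A tile-level status lets a kernel skip tiles that are entirely masked and avoid per-element checks on tiles that are entirely visible. The brute-force Python sketch below shows that classification for a chunked mask. It is an assumption-laden model: the enum TileStatus and its member names are hypothetical stand-ins for TileMaskStatus, tile_offset and tile_size are taken to be (query, key) pairs, and query positions are assumed to be already expressed in key coordinates.

```python
from enum import Enum


class TileStatus(Enum):  # hypothetical stand-in for TileMaskStatus
    NO_MASK = 0       # every element of the tile is visible
    PARTIAL_MASK = 1  # some elements are visible, some are masked
    FULL_MASK = 2     # every element of the tile is masked


def tile_status(tile_offset: tuple[int, int], tile_size: tuple[int, int],
                local_window_size: int) -> TileStatus:
    """Classify a (rows x cols) tile by counting its visible elements."""
    q0, k0 = tile_offset
    rows, cols = tile_size
    visible = sum(
        1
        for q in range(q0, q0 + rows)
        for k in range(k0, k0 + cols)
        if q // local_window_size == k // local_window_size
    )
    if visible == 0:
        return TileStatus.FULL_MASK
    if visible == rows * cols:
        return TileStatus.NO_MASK
    return TileStatus.PARTIAL_MASK


print(tile_status((4, 4), (4, 4), 4))  # tile coincides with chunk 1
print(tile_status((0, 4), (4, 4), 4))  # chunk-0 queries vs chunk-1 keys
print(tile_status((2, 0), (4, 4), 4))  # tile straddles a chunk boundary
```

A real implementation would compare chunk boundaries directly instead of enumerating elements; the exhaustive count is used here only because it is easy to verify.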

`start_column`

start_column[BM: Int, BN: Int, page_size: Int](self, row: UInt32) -> UInt32

Returns:

UInt32

`total_iters`

total_iters[BM: Int, BN: Int, page_size: Int](self, row: UInt32, num_cols: UInt32) -> UInt32

Returns:

UInt32

`count_nonfull_sets`

static count_nonfull_sets(BM: Int, BN: Int) -> Int

Returns:

Int

`last_masked_set_end`

last_masked_set_end[BM: Int, BN: Int, page_size: Int](self, row: UInt32, num_cols: UInt32) -> UInt32

Returns:

UInt32

`masked_set_ends`

masked_set_ends[BM: Int, BN: Int, page_size: Int](self, row: UInt32, num_cols: UInt32) -> StaticTuple[UInt32, ChunkedMask.count_nonfull_sets(BM, BN)]

Returns:

StaticTuple[UInt32, ChunkedMask.count_nonfull_sets(BM, BN)]

`nonfull_sets`

static nonfull_sets[BM: Int, BN: Int]() -> StaticTuple[TileMaskStatus, ChunkedMask.count_nonfull_sets(BM, BN)]

Returns:

StaticTuple[TileMaskStatus, ChunkedMask.count_nonfull_sets(BM, BN)]

`mask_strategies`

static mask_strategies[BM: Int, BN: Int]() -> StaticTuple[MaskStrategy, ChunkedMask.count_nonfull_sets(BM, BN)]

Returns:

StaticTuple[MaskStrategy, ChunkedMask.count_nonfull_sets(BM, BN)]
