Mojo struct

ContinuousBatchingKVCacheCollection

struct ContinuousBatchingKVCacheCollection[dtype_: DType, kv_params_: KVCacheStaticParams]

This is a "view" of the cache for the given sequences in the batch.

This object does not own the underlying buffers in k_cache and v_cache; it borrows them from the BlockWrappers in our KVCacheManager. It does own the Pointer[NDBuffer[dtype, 3]] and the valid_lengths buffer.
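The borrowing relationship described above can be illustrated with a small conceptual model. This is a sketch in Python, not the Mojo API: `ToyKVCollectionView`, `block_for`, and the flat per-block lists are hypothetical stand-ins, and the idea that `lookup_table` maps a batch index to a block index is an assumption based on the field names. Only `cache_length(bs_idx)` mirrors a signature documented on this page.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyKVCollectionView:
    """Toy stand-in for a non-owning KV cache collection (not the Mojo API)."""

    blocks: List[List[float]]  # borrowed storage: one flat buffer per cache block
    cache_lengths: List[int]   # tokens already cached for each batch entry
    lookup_table: List[int]    # assumed mapping: batch index -> block index

    def cache_length(self, bs_idx: int) -> int:
        # Mirrors the documented signature cache_length(bs_idx) -> Int.
        return int(self.cache_lengths[bs_idx])

    def block_for(self, bs_idx: int) -> List[float]:
        # Hypothetical helper: resolve a batch entry to its borrowed block.
        return self.blocks[self.lookup_table[bs_idx]]


# The "manager" owns the storage; the view only holds references to it.
manager_blocks = [[0.0] * 4 for _ in range(3)]
view = ToyKVCollectionView(
    blocks=manager_blocks,
    cache_lengths=[2, 0],
    lookup_table=[2, 0],
)

# Writing through the view mutates the manager's storage, since no copy was made.
view.block_for(0)[0] = 1.5
```

Because the view holds references rather than copies, the storage must outlive every view built on top of it. The same constraint applies to the real struct with respect to the buffers held by the KVCacheManager.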

Parameters

dtype_ (DType): The dtype of the kv-cache.
kv_params_ (KVCacheStaticParams): The kv-cache static parameters.

Fields

cache_lengths (NDBuffer[uint32, 1, MutableAnyOrigin]):
lookup_table (NDBuffer[uint32, 1, MutableAnyOrigin]):
blocks (NDBuffer[dtype_, 6, MutableAnyOrigin, DimList(Dim(-31337), Dim(-31337), Dim(-31337), Dim(-31337), Dim(kv_params_.num_heads), Dim(kv_params_.head_size)), _strides_from_shape[::DimList,::Int]()]):
max_seq_length (SIMD[uint32, 1]):
max_cache_length (SIMD[uint32, 1]):
kv_cache_dynamic_shape (IndexList[4]):
kv_cache_dynamic_strides (IndexList[4]):

Implemented traits

AnyType, Copyable, KVCollectionT, Movable, UnknownDestructibility

Aliases

`blocks_shape`

alias blocks_shape = DimList(Dim(-31337), Dim(-31337), Dim(-31337), Dim(-31337), Dim(kv_params_.num_heads), Dim(kv_params_.head_size))

`blocks_stride`

alias blocks_stride = _strides_from_shape[::DimList,::Int]()

`blocks_type`

alias blocks_type = NDBuffer[dtype_, 6, MutableAnyOrigin, DimList(Dim(-31337), Dim(-31337), Dim(-31337), Dim(-31337), Dim(kv_params_.num_heads), Dim(kv_params_.head_size)), _strides_from_shape[::DimList,::Int]()]

`CacheType`

alias CacheType = ContinuousBatchingKVCache[dtype_, kv_params_]

`dtype`

alias dtype = dtype_

`kv_params`

alias kv_params = kv_params_

`name_str`

alias name_str = "continuous_batching"

Methods

`__init__`

__init__(out self, blocks: NDBuffer[dtype_, 6, MutableAnyOrigin], cache_lengths: NDBuffer[uint32, 1, MutableAnyOrigin], lookup_table: NDBuffer[uint32, 1, MutableAnyOrigin], max_seq_length: SIMD[uint32, 1], max_cache_length: SIMD[uint32, 1])

`copy`

copy(self) -> Self

Explicitly construct a copy of self.

Returns:

A copy of this value.

`get_key_cache`

get_key_cache(self, layer_idx: Int) -> ContinuousBatchingKVCache[dtype_, kv_params_]

`get_value_cache`

get_value_cache(self, layer_idx: Int) -> ContinuousBatchingKVCache[dtype_, kv_params_]

`cache_length`

cache_length(self, bs_idx: Int) -> Int
