Mojo struct

ContinuousBatchingKVCacheCollection

struct ContinuousBatchingKVCacheCollection[type_: DType, kv_params_: KVCacheStaticParams]

This is a "view" of the cache for the given sequences in the batch.

This object does not own the underlying buffers in k_cache and v_cache; it borrows them from the BlockWrappers in our KVCacheManager. It does own the Pointer[NDBuffer[type, 3]] and the valid_lengths buffer.
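The borrow-versus-own distinction above can be illustrated with a small conceptual model in plain Python. This is only a sketch and not the Mojo API: the `CacheView` class and the `k_storage`, `v_storage`, and `lengths` names are hypothetical, and the "manager" is reduced to a pair of arrays.

```python
from array import array


class CacheView:
    """Hypothetical non-owning view over buffers owned by a manager."""

    def __init__(self, k_cache: memoryview, v_cache: memoryview, valid_lengths):
        # Borrowed: the view keeps references to the storage, not copies.
        self.k_cache = k_cache
        self.v_cache = v_cache
        # Owned: the view keeps its own copy of the small metadata list.
        self.valid_lengths = list(valid_lengths)


# The stand-in "manager" owns the actual storage.
k_storage = array("f", [0.0] * 8)
v_storage = array("f", [0.0] * 8)
lengths = [3, 5]

view = CacheView(memoryview(k_storage), memoryview(v_storage), lengths)

# Writing through the view mutates the manager's storage.
view.k_cache[0] = 1.5

# Mutating the caller's list does not affect the view's own copy.
lengths[0] = 99
```

In this model, dropping `view` leaves `k_storage` and `v_storage` intact, which mirrors the idea that the collection is a lightweight handle and the cache manager controls the lifetime of the blocks.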

Aliases

  • type = type_
  • kv_params = kv_params_
  • CacheType = ContinuousBatchingKVCache[type_, kv_params_]
  • blocks_shape = DimList(Dim(-31337), Dim(-31337), Dim(-31337), Dim(-31337), Dim(kv_params_.num_heads), Dim(kv_params_.head_size))
  • blocks_stride = _strides_from_shape[::DimList,::Int]()
  • blocks_type = NDBuffer[type_, 6, MutableAnyOrigin, DimList(Dim(-31337), Dim(-31337), Dim(-31337), Dim(-31337), Dim(kv_params_.num_heads), Dim(kv_params_.head_size)), _strides_from_shape[::DimList,::Int]()]
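As a rough illustration of the blocks_shape and blocks_stride aliases, the sketch below models a partially static shape in plain Python. It rests on two assumptions that this page does not state: that Dim(-31337) is the rendered form of a dimension unknown at compile time, and that _strides_from_shape produces contiguous row-major strides. The helper names and the example sizes (8 heads, head size 64, runtime dims 4, 2, 3, 16) are hypothetical.

```python
UNKNOWN = -31337  # assumed sentinel for a dimension that is only known at run time


def resolve_shape(static_shape, runtime_dims):
    """Fill each UNKNOWN slot of static_shape with the next runtime dimension."""
    dims = iter(runtime_dims)
    return tuple(next(dims) if d == UNKNOWN else d for d in static_shape)


def strides_from_shape(shape):
    """Row-major (C-contiguous) strides, in elements, for a fully known shape."""
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return tuple(strides)


def flat_offset(index, strides):
    """Element offset of a multi-dimensional index under the given strides."""
    return sum(i * s for i, s in zip(index, strides))


# Four leading dynamic dims, two trailing static dims (num_heads, head_size).
static_shape = (UNKNOWN, UNKNOWN, UNKNOWN, UNKNOWN, 8, 64)
shape = resolve_shape(static_shape, (4, 2, 3, 16))
strides = strides_from_shape(shape)
offset = flat_offset((1, 0, 2, 0, 0, 0), strides)
```

Because the two innermost dimensions are static, the two innermost strides (64 and 1 here) are fixed by kv_params alone, while the outer strides depend on runtime sizes.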

Fields

  • cache_lengths (NDBuffer[uint32, 1, MutableAnyOrigin])
  • lookup_table (NDBuffer[uint32, 1, MutableAnyOrigin])
  • blocks (NDBuffer[type_, 6, MutableAnyOrigin, DimList(Dim(-31337), Dim(-31337), Dim(-31337), Dim(-31337), Dim(kv_params_.num_heads), Dim(kv_params_.head_size)), _strides_from_shape[::DimList,::Int]()])
  • max_seq_length (SIMD[uint32, 1])
  • max_cache_length (SIMD[uint32, 1])
  • kv_cache_dynamic_shape (Index[4])
  • kv_cache_dynamic_strides (Index[4])

Implemented traits

AnyType, Copyable, KVCollectionT, Movable, UnknownDestructibility

Methods

__init__

__init__(out self, blocks: NDBuffer[type_, 6, MutableAnyOrigin], cache_lengths: NDBuffer[uint32, 1, MutableAnyOrigin], lookup_table: NDBuffer[uint32, 1, MutableAnyOrigin], max_seq_length: SIMD[uint32, 1], max_cache_length: SIMD[uint32, 1])

__copyinit__

__copyinit__(out self, other: Self)

__moveinit__

__moveinit__(out self, owned other: Self)

copy

copy(self) -> Self

Explicitly construct a copy of self.

Returns:

A copy of this value.

get_key_cache

get_key_cache(self, layer_idx: Int) -> ContinuousBatchingKVCache[type_, kv_params_]

get_value_cache

get_value_cache(self, layer_idx: Int) -> ContinuousBatchingKVCache[type_, kv_params_]
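One way to picture get_key_cache and get_value_cache is as selectors that fix two axes of the shared 6-D blocks buffer and return a per-layer view. The sketch below is a plain-Python model, not the Mojo implementation. The axis order it uses (block, key-or-value, layer, token, with heads and head size collapsed into a single entry) is an assumption, as are the `LayerCache` and `Collection` classes and their `load` method.

```python
class LayerCache:
    """Hypothetical per-layer view that keeps a reference to the shared blocks."""

    def __init__(self, blocks, kv_idx, layer_idx):
        self.blocks = blocks        # shared, not copied
        self.kv_idx = kv_idx        # 0 = key, 1 = value (assumed convention)
        self.layer_idx = layer_idx

    def load(self, block_idx, tok_idx):
        return self.blocks[block_idx][self.kv_idx][self.layer_idx][tok_idx]


class Collection:
    """Hypothetical stand-in for the collection: one buffer, many layer views."""

    KEY, VALUE = 0, 1

    def __init__(self, blocks):
        self.blocks = blocks

    def get_key_cache(self, layer_idx):
        return LayerCache(self.blocks, self.KEY, layer_idx)

    def get_value_cache(self, layer_idx):
        return LayerCache(self.blocks, self.VALUE, layer_idx)


# Each entry records its own coordinates so lookups are easy to check.
num_blocks, num_layers, max_seq_len = 2, 3, 4
blocks = [
    [
        [[(b, kv, layer, t) for t in range(max_seq_len)] for layer in range(num_layers)]
        for kv in range(2)
    ]
    for b in range(num_blocks)
]

collection = Collection(blocks)
key_entry = collection.get_key_cache(1).load(0, 2)
value_entry = collection.get_value_cache(2).load(1, 3)
```

Both methods return the same cache type in the signatures above, which is consistent with keys and values living in one buffer and differing only in which slice the view points at.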

cache_length

cache_length(self, bs_idx: Int) -> Int
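The page gives no description for cache_length, but the signature and the cache_lengths field suggest a per-sequence lookup by batch index. The following plain-Python sketch shows that reading. The `BatchLookup` class, its `block_for` helper, and the interpretation of lookup_table as a mapping from batch slot to block index are assumptions, not documented behavior.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BatchLookup:
    lookup_table: List[int]   # assumed: batch slot -> index of that sequence's block
    cache_lengths: List[int]  # assumed: batch slot -> number of tokens already cached

    def block_for(self, bs_idx: int) -> int:
        """Hypothetical helper: which block holds the sequence in this batch slot."""
        return self.lookup_table[bs_idx]

    def cache_length(self, bs_idx: int) -> int:
        """Number of cached tokens for the sequence in this batch slot."""
        return self.cache_lengths[bs_idx]


batch = BatchLookup(lookup_table=[7, 2, 5], cache_lengths=[10, 0, 42])
```

Under this reading, a sequence that has just joined the batch would report a cache length of 0, as the second slot does in the example.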