Mojo struct
ContinuousBatchingKVCacheCollection
struct ContinuousBatchingKVCacheCollection[dtype_: DType, kv_params_: KVCacheStaticParams]
This is a "view" of the cache for the given sequences in the batch.
This object does not own the underlying buffers in k_cache and v_cache; it borrows them from the BlockWrappers in our KVCacheManager. It does own the Pointer[NDBuffer[dtype, 3]] and valid_lengths buffer.
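The borrowing relationship described above can be illustrated with a small conceptual model. This is a Python sketch, not the Mojo API: `BlockStore`, `CollectionView`, and `block_for` are hypothetical names, and the idea that `lookup_table` maps a batch index to a block index is an assumption made for illustration.

```python
class BlockStore:
    """Hypothetical owner of the backing storage (stands in for the manager)."""

    def __init__(self, num_blocks, block_size):
        # The owner allocates the blocks and controls their lifetime.
        self.blocks = [[0.0] * block_size for _ in range(num_blocks)]


class CollectionView:
    """Hypothetical non-owning view: it keeps references and never copies blocks."""

    def __init__(self, blocks, lookup_table, cache_lengths):
        self.blocks = blocks                # borrowed: same object as the owner's
        self.lookup_table = lookup_table    # assumed: batch index -> block index
        self.cache_lengths = cache_lengths  # small per-batch metadata held by the view

    def block_for(self, bs_idx):
        # Resolve the batch entry to its block through the lookup table.
        return self.blocks[self.lookup_table[bs_idx]]


store = BlockStore(num_blocks=4, block_size=3)
view = CollectionView(store.blocks, lookup_table=[2, 0], cache_lengths=[1, 0])

# A write through the view lands in the owner's storage, because nothing was copied.
view.block_for(0)[0] = 7.0
```

Because the view only holds references, building one per batch is cheap, but the view is only valid while the owner keeps the blocks alive.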
Parameters
- dtype_ (DType): The dtype of the kv-cache.
- kv_params_ (KVCacheStaticParams): The kv-cache static parameters.
Fields
- cache_lengths (NDBuffer[uint32, 1, MutableAnyOrigin])
- lookup_table (NDBuffer[uint32, 1, MutableAnyOrigin])
- blocks (NDBuffer[dtype_, 6, MutableAnyOrigin, DimList(Dim(-31337), Dim(-31337), Dim(-31337), Dim(-31337), Dim(kv_params_.num_heads), Dim(kv_params_.head_size)), _strides_from_shape[::DimList,::Int]()])
- max_seq_length (SIMD[uint32, 1])
- max_cache_length (SIMD[uint32, 1])
- kv_cache_dynamic_shape (IndexList[4])
- kv_cache_dynamic_strides (IndexList[4])
Implemented traits
AnyType, Copyable, KVCollectionT, Movable, UnknownDestructibility
Aliases
blocks_shape
alias blocks_shape = DimList(Dim(-31337), Dim(-31337), Dim(-31337), Dim(-31337), Dim(kv_params_.num_heads), Dim(kv_params_.head_size))
blocks_stride
alias blocks_stride = _strides_from_shape[::DimList,::Int]()
blocks_type
alias blocks_type = NDBuffer[dtype_, 6, MutableAnyOrigin, DimList(Dim(-31337), Dim(-31337), Dim(-31337), Dim(-31337), Dim(kv_params_.num_heads), Dim(kv_params_.head_size)), _strides_from_shape[::DimList,::Int]()]
CacheType
alias CacheType = ContinuousBatchingKVCache[dtype_, kv_params_]
dtype
alias dtype = dtype_
kv_params
alias kv_params = kv_params_
name_str
alias name_str = "continuous_batching"
Methods
__init__
__init__(out self, blocks: NDBuffer[dtype_, 6, MutableAnyOrigin], cache_lengths: NDBuffer[uint32, 1, MutableAnyOrigin], lookup_table: NDBuffer[uint32, 1, MutableAnyOrigin], max_seq_length: SIMD[uint32, 1], max_cache_length: SIMD[uint32, 1])
copy
copy(self) -> Self
Explicitly construct a copy of self.
Returns:
A copy of this value.
get_key_cache
get_key_cache(self, layer_idx: Int) -> ContinuousBatchingKVCache[dtype_, kv_params_]
get_value_cache
get_value_cache(self, layer_idx: Int) -> ContinuousBatchingKVCache[dtype_, kv_params_]
cache_length
cache_length(self, bs_idx: Int) -> Int
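The three accessors above are undocumented on this page, so the following is only a conceptual Python model of how they might relate to the fields. It assumes, without confirmation from this page, that the six axes of `blocks` are ordered `[block, key_or_value, layer, token, head, head_dim]`, that `lookup_table` maps a batch index to a block index, and that `cache_length` reads `cache_lengths`. `CollectionSketch`, `LayerCacheView`, `read`, and `make_blocks` are hypothetical names.

```python
KEY_IDX, VALUE_IDX = 0, 1  # assumed positions of key and value on the second axis


def make_blocks(num_blocks, num_layers, max_seq_len, num_heads, head_size):
    """Build a zero-filled nested list indexed [block][kv][layer][token][head][dim]."""
    return [[[[[[0.0] * head_size
                for _ in range(num_heads)]
               for _ in range(max_seq_len)]
              for _ in range(num_layers)]
             for _ in range(2)]
            for _ in range(num_blocks)]


class LayerCacheView:
    """Hypothetical per-layer view that fixes the layer and the key/value axis."""

    def __init__(self, blocks, lookup_table, cache_lengths, layer_idx, kv_idx):
        self.blocks = blocks
        self.lookup_table = lookup_table
        self.cache_lengths = cache_lengths
        self.layer_idx = layer_idx
        self.kv_idx = kv_idx

    def cache_length(self, bs_idx):
        return int(self.cache_lengths[bs_idx])

    def read(self, bs_idx, tok_idx, head_idx, dim_idx):
        block = self.lookup_table[bs_idx]  # assumed: batch index -> block index
        return self.blocks[block][self.kv_idx][self.layer_idx][tok_idx][head_idx][dim_idx]


class CollectionSketch:
    """Hypothetical stand-in for the collection: hands out per-layer views."""

    def __init__(self, blocks, cache_lengths, lookup_table):
        self.blocks = blocks
        self.cache_lengths = cache_lengths
        self.lookup_table = lookup_table

    def get_key_cache(self, layer_idx):
        return LayerCacheView(self.blocks, self.lookup_table,
                              self.cache_lengths, layer_idx, KEY_IDX)

    def get_value_cache(self, layer_idx):
        return LayerCacheView(self.blocks, self.lookup_table,
                              self.cache_lengths, layer_idx, VALUE_IDX)

    def cache_length(self, bs_idx):
        # Assumed: number of tokens already cached for this batch entry.
        return int(self.cache_lengths[bs_idx])


blocks = make_blocks(num_blocks=3, num_layers=2, max_seq_len=4, num_heads=1, head_size=2)
blocks[1][VALUE_IDX][0][0][0][1] = 5.0  # value entry: block 1, layer 0, token 0
coll = CollectionSketch(blocks, cache_lengths=[3, 0], lookup_table=[1, 2])
```

Under these assumptions, `get_key_cache` and `get_value_cache` differ only in which slice of the second axis they expose, which would explain why both return the same `ContinuousBatchingKVCache[dtype_, kv_params_]` type.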