
Mojo function

generic_get_paged_cache

generic_get_paged_cache[dtype: DType](
    blocks: ManagedTensorSlice[MutableInput, static_spec=blocks.static_spec],
    cache_lengths: ManagedTensorSlice[Input, static_spec=cache_lengths.static_spec],
    lookup_table: ManagedTensorSlice[Input, static_spec=lookup_table.static_spec],
    max_lengths: ManagedTensorSlice[Input, static_spec=max_lengths.static_spec],
    out result: PagedKVCacheCollection[
        dtype,
        KVCacheStaticParams(
            SIMD(StaticTensorSpec[dtype, 6, blocks.static_spec.static_layout, blocks.static_spec.InFusion, blocks.static_spec.OutFusion, blocks.static_spec.ComputeFusion].shape.get[4]()),
            SIMD(StaticTensorSpec[dtype, 6, blocks.static_spec.static_layout, blocks.static_spec.InFusion, blocks.static_spec.OutFusion, blocks.static_spec.ComputeFusion].shape.get[5]()),
            (StaticTensorSpec[dtype, 6, blocks.static_spec.static_layout, blocks.static_spec.InFusion, blocks.static_spec.OutFusion, blocks.static_spec.ComputeFusion].shape.get[1]() == 1)
        ),
        StaticTensorSpec[dtype, 6, blocks.static_spec.static_layout, blocks.static_spec.InFusion, blocks.static_spec.OutFusion, blocks.static_spec.ComputeFusion].shape.get[3]()
    ]
)

Returns:

PagedKVCacheCollection
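
The result type of this overload is computed entirely from the static shape of the rank-6 `blocks` tensor: dimension 4 and dimension 5 become the first two `KVCacheStaticParams` arguments, the test `shape[1] == 1` becomes the third, and dimension 3 becomes the page size. The sketch below models only that index mapping in plain Python. It is not the Mojo API: `StaticParams`, `derive_paged_cache_params`, and the field names `num_heads`, `head_size`, and `is_mla` are assumptions introduced for illustration, since the signature above gives positions but not names.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class StaticParams:
    """Hypothetical stand-in for KVCacheStaticParams; field names are assumed."""

    num_heads: int
    head_size: int
    is_mla: bool


def derive_paged_cache_params(blocks_shape: Sequence[int]) -> Tuple[StaticParams, int]:
    """Mirror the index mapping in the first overload's result type.

    blocks_shape[4] -> first KVCacheStaticParams argument
    blocks_shape[5] -> second KVCacheStaticParams argument
    blocks_shape[1] == 1 -> third KVCacheStaticParams argument
    blocks_shape[3] -> page size
    """
    if len(blocks_shape) != 6:
        raise ValueError("blocks must be a rank-6 tensor shape")

    params = StaticParams(
        num_heads=blocks_shape[4],
        head_size=blocks_shape[5],
        is_mla=(blocks_shape[1] == 1),
    )
    page_size = blocks_shape[3]
    return params, page_size
```

For example, under these assumptions `derive_paged_cache_params((128, 2, 32, 16, 8, 64))` yields `StaticParams(num_heads=8, head_size=64, is_mla=False)` with a page size of 16. The second overload skips this derivation and takes `kv_params` and `page_size` as explicit parameters.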

generic_get_paged_cache[dtype: DType, kv_params: KVCacheStaticParams, page_size: Int](
    blocks: LayoutTensor[dtype, Layout.row_major[6](), blocks.origin],
    cache_lengths: LayoutTensor[DType.uint32, Layout(IntTuple(-1)), cache_lengths.origin],
    lookup_table: LayoutTensor[DType.uint32, Layout.row_major[2](), lookup_table.origin],
    max_lengths: LayoutTensor[DType.uint32, Layout.row_major[2](), max_lengths.origin],
    out result: PagedKVCacheCollection[dtype, kv_params, page_size]
)

Returns:

PagedKVCacheCollection
