Mojo function
rope_split_store_paged_ragged
rope_split_store_paged_ragged[
    dtype: DType,
    freq_dtype: DType,
    target: StringSlice[StaticConstantOrigin] = "cpu",
    interleaved: Bool = True
](
    qkv: TileTensor[dtype, qkv.LayoutType, qkv.origin, address_space=qkv.address_space, linear_idx_type=qkv.linear_idx_type, element_size=qkv.element_size],
    input_row_offsets: TileTensor[DType.uint32, input_row_offsets.LayoutType, input_row_offsets.origin, address_space=input_row_offsets.address_space, linear_idx_type=input_row_offsets.linear_idx_type, element_size=input_row_offsets.element_size],
    freqs_cis: TileTensor[freq_dtype, freqs_cis.LayoutType, freqs_cis.origin, address_space=freqs_cis.address_space, linear_idx_type=freqs_cis.linear_idx_type, element_size=freqs_cis.element_size],
    kv_collection: PagedKVCacheCollection[kv_collection.dtype_, kv_collection.kv_params_, kv_collection.page_size, kv_collection.scale_dtype_, kv_collection.quantization_granularity_],
    layer_idx: UInt32,
    q_output: TileTensor[dtype, q_output.LayoutType, q_output.origin, address_space=q_output.address_space, linear_idx_type=q_output.linear_idx_type, element_size=q_output.element_size],
    ctx: DeviceContextPtr
)
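Applies rotary position embedding (RoPE) to a ragged, fused QKV tensor, splits it into Q, K, and V, and stores the results using a paged KV cache collection.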
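The data flow suggested by the function name and its arguments can be illustrated with a minimal pure-Python sketch. This is not the Mojo kernel and does not use its API. Everything here is an assumption made for illustration: a single head each for Q, K, and V, rows laid out as `q + k + v`, the interleaved pairing of elements `(2i, 2i+1)`, and the hypothetical helpers `build_freqs`, `rope_interleaved`, and `rope_split_store` along with the `cache_lengths`, `page_table`, and `pages` structures.

```python
import math


def build_freqs(max_pos, head_dim, theta=10000.0):
    # Hypothetical stand-in for freqs_cis:
    # freqs[pos][i] = (cos, sin) of the rotation angle for pair i at position pos.
    return [
        [
            (math.cos(pos * theta ** (-2.0 * i / head_dim)),
             math.sin(pos * theta ** (-2.0 * i / head_dim)))
            for i in range(head_dim // 2)
        ]
        for pos in range(max_pos)
    ]


def rope_interleaved(vec, pair_freqs):
    # Rotate each consecutive pair (vec[2i], vec[2i+1]) by its angle.
    out = []
    for i, (c, s) in enumerate(pair_freqs):
        re, im = vec[2 * i], vec[2 * i + 1]
        out += [re * c - im * s, re * s + im * c]
    return out


def rope_split_store(qkv, row_offsets, freqs, cache_lengths,
                     page_table, pages, page_size, head_dim):
    # qkv is ragged: rows row_offsets[s]..row_offsets[s+1] belong to sequence s.
    q_output = []
    for seq in range(len(row_offsets) - 1):
        start, end = row_offsets[seq], row_offsets[seq + 1]
        for tok in range(start, end):
            # Position of this token within its sequence, after cached tokens.
            pos = cache_lengths[seq] + (tok - start)
            row = qkv[tok]
            # Split the fused row (assumed layout: q, then k, then v).
            q = row[:head_dim]
            k = row[head_dim:2 * head_dim]
            v = row[2 * head_dim:]
            # Q gets RoPE and goes to the output buffer.
            q_output.append(rope_interleaved(q, freqs[pos]))
            # K gets RoPE, V is stored as-is, both in the page that holds pos.
            page = page_table[seq][pos // page_size]
            slot = pos % page_size
            pages[page][slot] = (rope_interleaved(k, freqs[pos]), list(v))
    return q_output


# Two sequences: tokens 0-1 belong to sequence 0, token 2 to sequence 1.
head_dim = 2
qkv = [
    [1.0, 0.0, 0.0, 1.0, 5.0, 6.0],
    [1.0, 0.0, 0.0, 1.0, 7.0, 8.0],
    [2.0, 0.0, 1.0, 1.0, 9.0, 10.0],
]
row_offsets = [0, 2, 3]
pages = [[None, None], [None, None]]
q_out = rope_split_store(
    qkv, row_offsets, build_freqs(4, head_dim), cache_lengths=[0, 0],
    page_table=[[0], [1]], pages=pages, page_size=2, head_dim=head_dim,
)
```

In this sketch the offsets play the role of `input_row_offsets`, the returned list plays the role of `q_output`, and `pages` stands in for one layer of the paged cache. Whether the real kernel uses the same layouts is not stated in the source.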