Mojo function

rms_norm_value_cache_ragged_paged

def rms_norm_value_cache_ragged_paged[dtype: DType, params: KVCacheStaticParams, page_size: Int, cache_dtype: DType, //, target: StringSlice[StaticConstantOrigin], multiply_before_cast: Bool, per_head_norm: Bool](kv_collection: PagedKVCacheCollection[cache_dtype, params, page_size, scale_dtype_=kv_collection.scale_dtype_, quantization_granularity_=kv_collection.quantization_granularity_], gamma: TileTensor[dtype, Storage=gamma.Storage, address_space=gamma.address_space, linear_idx_type=gamma.linear_idx_type, element_size=gamma.element_size], epsilon: Scalar[dtype], weight_offset: Scalar[dtype], layer_idx: UInt32, total_seq_len: UInt32, input_row_offsets: TileTensor[DType.uint32, Storage=input_row_offsets.Storage, address_space=input_row_offsets.address_space, linear_idx_type=input_row_offsets.linear_idx_type, element_size=input_row_offsets.element_size], context: DeviceContext)

Performs RMSNorm in place on new entries in the value cache.

Uses the same indexing and layout that rms_norm_kv_cache_ragged_paged applies to the key cache, but reads from and writes to the value cache tensor for layer_idx.
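As a rough illustration of the arithmetic, the following Python sketch applies RMSNorm in place to ragged-packed rows. It is a reference model under stated assumptions, not the kernel's implementation. A flat list of rows stands in for the new value-cache entries of one layer (paging, heads, and dtype casts are not modeled), `input_row_offsets` is assumed to hold prefix offsets so that sequence `b` owns rows `[offsets[b], offsets[b + 1])`, and `weight_offset` is assumed to be added to `gamma` before scaling. The helper name `rms_norm_rows_in_place` is hypothetical.

```python
import math


def rms_norm_rows_in_place(values, gamma, epsilon, weight_offset, input_row_offsets):
    """Normalize every new token row of `values` in place and return it.

    values: list of rows, one per new token, packed back to back (ragged).
    gamma: per-element scale, same length as each row.
    input_row_offsets: assumed prefix offsets, length batch_size + 1.
    """
    batch_size = len(input_row_offsets) - 1
    for b in range(batch_size):
        start, end = input_row_offsets[b], input_row_offsets[b + 1]
        for t in range(start, end):
            row = values[t]
            # Mean of squares over the normalized dimension.
            mean_sq = sum(x * x for x in row) / len(row)
            inv_rms = 1.0 / math.sqrt(mean_sq + epsilon)
            for i, x in enumerate(row):
                # Assumption: weight_offset shifts gamma before scaling.
                row[i] = x * inv_rms * (gamma[i] + weight_offset)
    return values


# Two sequences: the first contributes 2 new tokens, the second 1.
example = [[3.0, 4.0], [1.0, 1.0], [0.0, 2.0]]
rms_norm_rows_in_place(example, [1.0, 1.0], 0.0, 0.0, [0, 2, 3])
```

With `gamma` set to ones and both `epsilon` and `weight_offset` set to zero, each normalized row has a mean square of 1, which makes a convenient sanity check.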