
Mojo function

generic_flash_attention_kv_cache_continuous_batch

generic_flash_attention_kv_cache_continuous_batch[
    target: StringSlice[StaticConstantOrigin],
    type: DType
](
    q: NDBuffer[type, 4, origin, shape, strides],
    kv_collection: ContinuousBatchingKVCacheCollection[type_, kv_params_],
    layer_idx: SIMD[uint32, 1],
    mask: NDBuffer[type, rank, origin, shape, strides],
    valid_lengths: NDBuffer[uint32, 1, origin],
    scale: SIMD[float32, 1],
    output: NDBuffer[type, 4, origin, shape, strides],
    context: DeviceContextPtr
)
