
Mojo package

kv_cache

Contains implementations for several types of key-value caches.

KV caches are used in transformer models to store the key and value tensors produced by self-attention layers, so that later decoding steps can reuse them instead of recomputing them.
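As a rough illustration of the idea, here is a minimal conceptual sketch in Python. It is not the API of this package: `ToyKVCache` and `attend` are hypothetical names made up for this example, and it assumes a single attention head with plain Python lists standing in for tensors.

```python
# Conceptual sketch only: hypothetical names, not the Mojo kv_cache API.
import math


class ToyKVCache:
    """Stores one key vector and one value vector per processed token."""

    def __init__(self):
        self.keys = []    # key vectors, one per past token
        self.values = []  # value vectors, one per past token

    def append(self, key, value):
        self.keys.append(key)
        self.values.append(value)

    def __len__(self):
        return len(self.keys)


def attend(query, cache):
    """Scaled dot-product attention of one query over all cached tokens."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in cache.keys]
    # Numerically stable softmax over the cached positions.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    dim = len(cache.values[0])
    return [sum(w * v[i] for w, v in zip(weights, cache.values)) for i in range(dim)]


cache = ToyKVCache()
# Each decoding step appends only the new token's key and value,
# then attends over everything cached so far.
cache.append([1.0, 0.0], [10.0, 0.0])
out_first = attend([1.0, 0.0], cache)
cache.append([0.0, 1.0], [0.0, 10.0])
out_second = attend([1.0, 0.0], cache)
```

The point of the sketch is that each step computes a key and value only for the new token; entries for earlier tokens are read back from the cache rather than recomputed.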

These APIs are used by the higher-level functions in the nn package.

Modules

  • lmcache_transfer: KV cache transfer kernels for external cache integration (e.g., LMCache).
  • types: This module contains the types for the key-value cache APIs.
