
Mojo struct

KVCacheStaticParams

struct KVCacheStaticParams

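Fields

  • num_heads (Int): Number of attention heads.
  • head_size (Int): Size of each attention head.
  • is_mla (Bool): Whether MLA mode is enabled. If True, only the k cache is stored; if False, both the k and v caches are stored.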

Implemented traits

AnyType, Copyable, Equatable, ImplicitlyCopyable, ImplicitlyDeletable, Movable, RegisterPassable, TrivialRegisterPassable

Methods

__init__

def __init__(num_heads: Int, head_size: Int, is_mla: Bool = False) -> Self

Initialize KVCacheStaticParams.

Args:

  • num_heads (Int): Number of attention heads.
  • head_size (Int): Size of each attention head.
  • is_mla (Bool, optional): Whether to use Multi-head Latent Attention (MLA) mode. If True, only the k cache is stored. If False, both the k and v caches are stored. Defaults to False.
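
The effect of is_mla on what gets cached can be illustrated with a small Python model. This is a sketch only, not the MAX API: the class KVCacheStaticParamsModel and its helpers tensors_per_layer and elements_per_token are hypothetical names introduced here, and the per-token element count assumes each stored cache holds num_heads * head_size elements per token.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KVCacheStaticParamsModel:
    """Hypothetical Python stand-in for the Mojo struct; not part of MAX."""

    num_heads: int
    head_size: int
    is_mla: bool = False  # same default as the documented constructor

    def tensors_per_layer(self) -> int:
        # Illustrative helper (assumption): MLA mode stores only the k cache,
        # otherwise both the k and v caches are stored.
        return 1 if self.is_mla else 2

    def elements_per_token(self) -> int:
        # Illustrative helper (assumption): each stored cache holds
        # num_heads * head_size elements per token.
        return self.tensors_per_layer() * self.num_heads * self.head_size


standard = KVCacheStaticParamsModel(num_heads=8, head_size=128)
mla = KVCacheStaticParamsModel(num_heads=8, head_size=128, is_mla=True)

print(standard.elements_per_token())  # 2 * 8 * 128 = 2048
print(mla.elements_per_token())       # 1 * 8 * 128 = 1024
```

Under these assumptions, enabling is_mla halves the per-token cache footprint for the same num_heads and head_size, because the v cache is not stored.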