Mojo struct
KVPipelineGeneric
struct KVPipelineGeneric[num_kv_stages: Int, num_qk_stages: Int, num_producer: Int, num_consumer: Int]
KVPipeline has num_kv_stages * num_qk_stages stages in total. num_kv_stages is the number of K and V tiles that are pipelined for the S = Q@K' and O += P@V MMAs. Each of these MMAs is split into num_qk_stages pipelined MMAs. step=False is set for every one of these MMAs except the last, which completes the operation. An alternative implementation would track the two stage counts separately, potentially allowing more overall stages at the cost of slightly more bookkeeping.
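The stage structure described above can be modeled in plain Python. This is only an illustrative sketch, not the Mojo implementation: the function name `mma_schedule` and the tuple layout are hypothetical, and the `step` value simply mirrors the rule that only the last sub-MMA of each tile completes the operation.

```python
from typing import Iterator, Tuple


def mma_schedule(
    num_kv_stages: int, num_qk_stages: int
) -> Iterator[Tuple[int, int, bool]]:
    """Yield (kv_stage, qk_stage, step) for one pass over the pipeline.

    Hypothetical model: each K/V tile (kv_stage) is split into
    num_qk_stages sub-MMAs, and step is True only for the last one.
    """
    for kv_stage in range(num_kv_stages):
        for qk_stage in range(num_qk_stages):
            # Only the final sub-MMA of a tile completes the operation.
            step = qk_stage == num_qk_stages - 1
            yield kv_stage, qk_stage, step


# Example: 2 K/V tiles in flight, each MMA split into 3 sub-MMAs.
schedule = list(mma_schedule(num_kv_stages=2, num_qk_stages=3))
for kv_stage, qk_stage, step in schedule:
    print(f"kv_stage={kv_stage} qk_stage={qk_stage} step={step}")
```

In this model the sub-stage with step=True is always qk_stage = num_qk_stages - 1. That appears to line up with the default value of the qk_stage parameter on producer_acquire, consumer_wait, and consumer_release, although this page does not state that connection explicitly.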
Fields
- mbar (MBarType)
- state (PipelineState[num_kv_stages])
Implemented traits
AnyType,
Copyable,
ImplicitlyCopyable,
ImplicitlyDeletable,
Movable,
RegisterPassable,
TrivialRegisterPassable
comptime members
num_stages
comptime num_stages = (num_kv_stages * num_qk_stages)
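As a quick check of the arithmetic, the following Python sketch computes the total stage count and numbers every (kv_stage, qk_stage) pair. The row-major numbering in `flat_stage` is an assumption made for illustration; this page does not say how the struct actually lays out its barriers.

```python
def num_stages(num_kv_stages: int, num_qk_stages: int) -> int:
    """Total stage count, mirroring num_kv_stages * num_qk_stages."""
    return num_kv_stages * num_qk_stages


def flat_stage(kv_stage: int, qk_stage: int, num_qk_stages: int) -> int:
    """Hypothetical row-major index of a (kv_stage, qk_stage) pair."""
    return kv_stage * num_qk_stages + qk_stage


# Example: 2 K/V stages, each split into 3 sub-stages.
total = num_stages(2, 3)
indices = [flat_stage(kv, qk, 3) for kv in range(2) for qk in range(3)]
print(total, indices)
```

Under this assumed numbering every pair receives a distinct index in the range 0 to num_stages - 1, which is one way to see why the total is the product of the two parameters.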
Methods
__init__
def __init__(mbar: UnsafePointer[SharedMemBarrier, MutAnyOrigin, address_space=AddressSpace.SHARED]) -> Self
init
def init(self)
producer_mbar
def producer_mbar[qk_stage: Int](self) -> MBarType
Returns:
MBarType
consumer_mbar
def consumer_mbar[qk_stage: Int](self, idx: UInt32) -> MBarType
Returns:
MBarType
def consumer_mbar[qk_stage: Int](self) -> MBarType
Returns:
MBarType
producer_acquire
def producer_acquire[qk_stage: Int = (num_qk_stages - 1)](self)
Returns the dynamic pipe idx.
consumer_wait
def consumer_wait[qk_stage: Int = (num_qk_stages - 1)](self)
consumer_release
def consumer_release[qk_stage: Int = (num_qk_stages - 1)](mut self, e: Int32)
num_mbars