Mojo module
attention
comptime values
logger
comptime logger = Logger(stdout, prefix=String(""), source_location=False)
Structs
- FlashAttentionGPU
- MaskedFlashAttentionGPU
- MLAIndexerRaggedFloat8Paged
- NoMaskFlashAttentionCPU
- PaddedFlashAttentionGPU
- RaggedFlashAttentionGPU
- Struct_cross_attention_ragged_paged
- Struct_fused_qk_rope_padded_paged
- Struct_fused_qk_rope_ragged_paged
- Struct_fused_qk_rope_ragged_paged_with_position_id
- Struct_fused_qkv_matmul_padded_paged
- Struct_fused_qkv_matmul_padded_ragged
- Struct_fused_qkv_matmul_padded_ragged_bias
- Struct_fused_qkv_matmul_padded_ragged_bias_quantized
- Struct_fused_qkv_matmul_padded_ragged_quantized
- Struct_fused_qkv_matmul_padded_ragged_scale
- Struct_fused_qkv_matmul_padded_ragged_scale_bias
- Struct_fused_qkv_matmul_padded_ragged_scale_float4
- Struct_mha_decode_num_partitions
- Struct_mha_padded_paged
- Struct_mha_ragged_paged_fp8_kv: MHA with bf16 Q and fp8_e4m3fn paged KV cache (dequant-staging path).
- Struct_mha_ragged_paged_scalar_args
- Struct_mha_ragged_paged_sink_weights_scalar_args
- Struct_mla_compute_dispatch_args_scalar
- Struct_mla_decode_graph_bf16_paged
- Struct_mla_decode_graph_paged_fp8
- Struct_mla_decode_graph_paged_fp8_sparse
- Struct_mla_decode_ragged_paged
- Struct_mla_decode_ragged_paged_scaled
- Struct_mla_decompress_k_cache_ragged_paged
- Struct_mla_prefill_graph_bf16_paged
- Struct_mla_prefill_graph_decode_bf16_paged
- Struct_mla_prefill_graph_decode_bf16_paged_quantized
- Struct_mla_prefill_graph_decode_paged_fp8
- Struct_mla_prefill_graph_decode_paged_fp8_sparse
- Struct_mla_prefill_graph_paged
- Struct_mla_prefill_ragged_paged
- Struct_mla_prefill_ragged_plan
- Struct_mla_prefill_sparse_paged
- Struct_mla_prefill_sparse_paged_fp8
- WithMaskFlashAttentionCPU
- WithMaskFlashAttentionSplitKVCPU