Mojo module
mla_decode_dispatch
comptime values
logger
comptime logger = Logger(stdout, prefix=String(""), source_location=False)
Structs
- MLADispatchScalarArgs: Pre-computed MLA decode args for the legacy (non-capturable) path.
Functions
- compute_mla_dispatch_scalars: Pure computation of the packed 3-value MLA dispatch metadata.
- compute_mla_dispatch_scalars_runtime
- launch_mla_sm100_decode_enqueue_kernel
- launch_mla_sm100_decode_fp8_per_token_scale_rope_aware: Launch the FP8 per-token-scale rope-aware MLA decode kernel with split content/rope TMAs.
- launch_mla_sm100_decode_native_fp8: Launch the native FP8 MLA decode kernel with FP8 Q TMA.
- launch_mla_sm100_decode_sparse: Launch the sparse MLA decode kernel with gather4 TMA descriptors.
- launch_mla_sm100_decode_sparse_kv_fp8: Launches the all-FP8 sparse MLA decode kernel.
- mla_decode_sm100_dispatch
- mla_decode_sm100_sink_split_k