Mojo module
dispatch_fused_bias_residual
Dispatcher for fused matmul + bias/residual (mo.composite.matmul_add).
Kept separate from matmul_dispatch_sm100 so the generic matmul dispatch does
not have to thread an epilogue tensor through every kernel. The bias/residual is
honored in one of two mutually exclusive ways:
- Native: the SM100 Blackwell GEMM applies it via its TMA epilogue load (the tensor is passed, no lambda). Used for GEMM-shaped problems (M>1, N>1, bf16, 16B-aligned N/K); this is the fast prefill path.
- Fallback: the bias/residual is wrapped as a normal elementwise (store) epilogue and the generic matmul_dispatch_sm100 is reused. The GEMV (M=1/N=1), small-MN, and vendor (cuBLAS) paths all already apply an elementwise epilogue, so correctness is universal. No tensor is passed, so there is no double-add.
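The choice between the two paths can be sketched as a single predicate over the problem shape. This is a minimal illustration, not the module's actual code: the function name `use_native_epilogue` and its parameters are hypothetical, and "16B-aligned N/K" is assumed to mean that N and K each cover a multiple of 16 bytes (8 bf16 elements). The small-MN cutoff is not stated above, so the sketch does not model it.

```mojo
# Hypothetical helper, not part of dispatch_fused_bias_residual.
def use_native_epilogue(m: Int, n: Int, k: Int, is_bf16: Bool) -> Bool:
    # Assumption: bf16 is 2 bytes per element, so a 16-byte-aligned
    # dimension is a multiple of 8 elements.
    var elems_per_16_bytes = 16 // 2
    var gemm_shaped = m > 1 and n > 1
    var n_aligned = n % elems_per_16_bytes == 0
    var k_aligned = k % elems_per_16_bytes == 0
    # True selects the native TMA epilogue load; False selects the
    # elementwise-epilogue fallback through the generic dispatch.
    return gemm_shaped and is_bf16 and n_aligned and k_aligned


def main():
    # Prefill-shaped GEMM: eligible for the native path in this sketch.
    print(use_native_epilogue(512, 4096, 4096, True))
    # GEMV (M = 1): takes the fallback path.
    print(use_native_epilogue(1, 4096, 4096, True))
    # N not a multiple of 8 bf16 elements: takes the fallback path.
    print(use_native_epilogue(512, 4100, 4096, True))
```

Because exactly one branch is taken per call, the bias/residual is applied once: either as the tensor handed to the native kernel or as the elementwise epilogue, never both.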
comptime values
logger
comptime logger = Logger(stdout, prefix=String(""), source_location=False)
Functions