Mojo module
dispatch
Dispatch for depth=512 pair-CTA SM100 (Blackwell) MHA prefill.
Creates the Depth512SM100Config and the TMA tile descriptors, then launches the pair-CTA kernel with cluster_dim=(2,1,1). The TransientScheduler is configured with pair_cta=True so that both CTAs in a cluster derive the same tile index from block_idx.x >> 1.
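The tile-index rule can be modeled on the host as a minimal sketch. This is plain Python, not the Mojo kernel code, and the function names `pair_cta_tile_index` and `cta_rank_in_pair` are hypothetical. It assumes the two CTAs of each cluster occupy consecutive block_idx.x values, as they would with cluster_dim=(2,1,1) on a one-dimensional grid. The rank helper (`block_idx.x & 1`) is an illustrative assumption and is not taken from the module.

```python
def pair_cta_tile_index(block_idx_x: int) -> int:
    """Tile index shared by both CTAs of a 2-CTA cluster (block_idx.x >> 1)."""
    return block_idx_x >> 1


def cta_rank_in_pair(block_idx_x: int) -> int:
    """Hypothetical helper: position of a CTA inside its pair (0 or 1)."""
    return block_idx_x & 1


if __name__ == "__main__":
    # Blocks 2k and 2k+1 form one cluster and map to the same tile k.
    for block_idx_x in range(8):
        tile = pair_cta_tile_index(block_idx_x)
        rank = cta_rank_in_pair(block_idx_x)
        print(f"block_idx.x={block_idx_x} tile={tile} rank={rank}")
```

Under these assumptions a grid of 2N blocks covers N tiles, with each tile worked on cooperatively by the pair that shares its index.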
comptime values
logger
comptime logger = Logger(stdout, prefix=String(""), source_location=False)
Functions