Mojo module
grouped_matmul_block_scaled_dispatch
General dispatch for grouped block-scaled matmul.
Routes to format-specific grouped matmul implementations based on the input dtype and target GPU architecture. Currently supports NVFP4 on SM100; MXFP4 and MXFP8 will be added in subsequent commits.
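The routing described above can be pictured as a lookup keyed on the scaled format and the target architecture. The following is a minimal conceptual sketch in Python, not the module's Mojo API: the names `ScaledFormat`, `SUPPORTED`, `select_kernel`, and the string `"nvfp4_grouped_matmul"` are hypothetical and exist only for illustration. The only fact taken from this page is that NVFP4 on SM100 is the one supported combination for now.

```python
from enum import Enum


class ScaledFormat(Enum):
    """Block-scaled formats named on this page (hypothetical enum)."""

    NVFP4 = "nvfp4"
    MXFP4 = "mxfp4"
    MXFP8 = "mxfp8"


# (format, architecture) pairs that have an implementation. Only NVFP4 on
# SM100 is listed, because that is the only combination the page names.
# The kernel name is a placeholder, not a real function in the library.
SUPPORTED = {(ScaledFormat.NVFP4, "sm100"): "nvfp4_grouped_matmul"}


def select_kernel(fmt: ScaledFormat, arch: str) -> str:
    """Return the placeholder name of the implementation to route to.

    Raises NotImplementedError for any unsupported combination.
    """
    try:
        return SUPPORTED[(fmt, arch)]
    except KeyError:
        raise NotImplementedError(
            f"grouped block-scaled matmul: {fmt.name} on {arch} is not supported"
        ) from None
```

In this sketch, adding support for another format would amount to adding one entry to the table; the real module may well resolve the choice at compile time instead.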
Functions
- grouped_matmul_block_scaled_dispatch: Dispatch grouped block-scaled matmul to a format-specific implementation.