Mojo module
mma_amd
AMD CDNA Matrix Cores implementation for matrix multiply-accumulate operations.
This module provides MMA (matrix multiply-accumulate) implementations for AMD CDNA2, CDNA3, and CDNA4 data center GPUs, built on the MFMA (Matrix Fused Multiply-Add) instructions.
Reference: https://gpuopen.com/learn/amd-lab-notes/amd-lab-notes-matrix-cores-readme/
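For readers new to the term, a matrix multiply-accumulate computes D = A × B + C on small tiles. The sketch below is a plain Python reference of that arithmetic only. The name `mma_reference` is hypothetical and is not part of this module's API, and the sketch makes no attempt to model the hardware tile shapes, data types, or register layouts used by MFMA instructions.

```python
def mma_reference(a, b, c):
    """Return d = a @ b + c for row-major nested lists (illustrative only)."""
    m, k, n = len(a), len(b), len(b[0])
    # Shape checks: a is m x k, b is k x n, c is m x n.
    assert all(len(row) == k for row in a)
    assert len(c) == m and all(len(row) == n for row in c)

    # Start from a copy of the accumulator so the input c is not modified.
    d = [row[:] for row in c]
    for i in range(m):
        for j in range(n):
            for p in range(k):
                d[i][j] += a[i][p] * b[p][j]
    return d


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
c = [[0.5, 0.5], [0.5, 0.5]]
d = mma_reference(a, b, c)
```

On a GPU, the same arithmetic is carried out on fixed-size tiles by a single hardware instruction rather than by nested scalar loops.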