Mojo function
st_matrix_m_atom
st_matrix_m_atom[num_stmatrix: Int, num_consumer: Int]() -> Layout
Creates a layout for an M-major st_matrix atom in the context of the WGMMA C matrix.
The domain of this layout is the warp group local thread index. Thus, the layout takes [0, 128) as input and returns an offset for a logical array with an element size of 128-bit.
Assume num_consumer = 2 and num_stmatrix = 2. A single atom for one warp then looks like the diagram below. Each block contains a thread index, and each thread holds the address of the next 128-bit fragment.
| 0   | 8   |
| 1   | 9   |
| 2   | 10  |
| ... | ... |
| 7   | 15  |

| 16  | 24  |
| 17  | 25  |
| 18  | 26  |
| ... | ... |
| 23  | 31  |
All 4 warps in the warp group are then laid out next to each other:
| w1 | w2 | w3 | w4 |
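The thread arrangement in the diagrams above can be reproduced with a little index arithmetic. The following is a minimal sketch in plain Python, not the Mojo API: the helper names `diagram_position` and `render_warp` are hypothetical, and the code models only the picture for num_consumer = 2 and num_stmatrix = 2. It does not compute the offsets that the returned Layout produces.

```python
def diagram_position(thread_idx: int) -> tuple[int, int, int]:
    """Return (warp, row, col) for a warp-group-local thread index.

    Assumption: this mirrors only the diagram for num_consumer = 2 and
    num_stmatrix = 2. Warps are numbered from 0 here (w1 in the diagram
    corresponds to warp 0).
    """
    if not 0 <= thread_idx < 128:
        raise ValueError("thread_idx must be in [0, 128)")
    warp, lane = divmod(thread_idx, 32)
    # Each group of 8 consecutive lanes fills one column of an 8-row block.
    row = lane % 8 + 8 * (lane // 16)  # lanes 16..31 sit in the lower block
    col = (lane // 8) % 2              # lanes 8..15 and 24..31 sit in column 1
    return warp, row, col


def render_warp(warp: int = 0) -> list[list[int]]:
    """Rebuild the 16 x 2 grid of thread indices drawn for one warp."""
    grid = [[-1, -1] for _ in range(16)]
    for lane in range(32):
        thread_idx = warp * 32 + lane
        _, row, col = diagram_position(thread_idx)
        grid[row][col] = thread_idx
    return grid


if __name__ == "__main__":
    for row in render_warp(0):
        print("| {:>2} | {:>2} |".format(*row))
```

Running the script prints the 16 rows for the first warp. Passing 1, 2, or 3 to `render_warp` gives the corresponding grids for the remaining warps, whose thread indices continue from 32, 64, and 96.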
Parameters:
- num_stmatrix (Int): Number of N-dimension tiles in the C matrix.
- num_consumer (Int): Number of consumers.
Returns:
Layout: A layout that maps the warp group local thread index to an offset for a logical array with an element size of 128-bit.