Mojo function
find_K_alignment_upto_16B
find_K_alignment_upto_16B(row_bytes_arg: Int) -> Int
Find alignment among 1B, 2B, 4B, 8B, 16B based on the row's bytes.
This function determines the largest power-of-2 alignment (up to 16 bytes) that evenly divides the given row size. The result is used to choose the vector size for cp.async operations when the K dimension's alignment doesn't meet TMA requirements.
Args:
- row_bytes_arg (Int): Number of bytes in a row (K * sizeof(element)).
Returns:
Int: Alignment in bytes (1, 2, 4, 8, or 16).
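The selection rule described above can be illustrated with a minimal Python sketch. This is not the Mojo implementation. The name `find_alignment_upto_16b` and the error raised for non-positive input are assumptions made for illustration only; the page does not say how the real function treats such inputs.

```python
def find_alignment_upto_16b(row_bytes: int) -> int:
    """Return the largest power of two, capped at 16, that divides row_bytes.

    Hypothetical helper mirroring the documented behavior; rejecting
    non-positive sizes is an assumption, not documented behavior.
    """
    if row_bytes <= 0:
        raise ValueError("row_bytes must be positive")
    # Check candidates from largest to smallest so the first hit is the maximum.
    for alignment in (16, 8, 4, 2):
        if row_bytes % alignment == 0:
            return alignment
    # Every positive integer is divisible by 1.
    return 1


if __name__ == "__main__":
    # row_bytes = K * sizeof(element)
    print(find_alignment_upto_16b(24 * 4))   # K=24, 4-byte elements: 96 bytes
    print(find_alignment_upto_16b(100 * 2))  # K=100, 2-byte elements: 200 bytes
    print(find_alignment_upto_16b(9 * 2))    # K=9, 2-byte elements: 18 bytes
    print(find_alignment_upto_16b(7 * 1))    # K=7, 1-byte elements: 7 bytes
```

In this sketch a 96-byte row is divisible by 16, a 200-byte row by 8 but not 16, an 18-byte row only by 2, and a 7-byte row only by 1, so the four calls cover each distinct outcome except 4.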
