Mojo function
convert_e4m3fn_to_e4m3fnuz
convert_e4m3fn_to_e4m3fnuz(input_buffer: NDBuffer[float8_e4m3fn, 2, origin, shape, strides], output_buffer: NDBuffer[float8_e4m3fnuz, 2, origin, shape, strides], context: DeviceContext)
Convert E4M3FN weights to E4M3FNUZ format for AMD GPU compatibility.
This conversion handles the key differences between E4M3FN and E4M3FNUZ:
- The bit pattern 10000000 (0x80, or -128 when read as a signed 8-bit integer) represents negative zero in E4M3FN but NaN in E4M3FNUZ
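To make the bit-pattern difference concrete, here is a minimal Python sketch that decodes a single byte under each format. It is an illustration of the two encodings, not the Mojo kernel. The decoder names and `remap_negative_zero` are hypothetical helpers, and mapping 0x80 to 0x00 is an assumption about one reasonable way to handle that pattern. The exponent biases (7 for E4M3FN, 8 for E4M3FNUZ) and the NaN encodings come from the published format definitions, not from this page.

```python
import math


def decode_e4m3fn(bits: int) -> float:
    """Decode one E4M3FN byte: 1 sign, 4 exponent, 3 mantissa bits, bias 7."""
    sign = -1.0 if bits & 0x80 else 1.0
    exp = (bits >> 3) & 0xF
    man = bits & 0x7
    if exp == 0xF and man == 0x7:
        return math.nan  # 0x7F and 0xFF are the only NaN encodings; no infinities
    if exp == 0:
        return sign * (man / 8.0) * 2.0 ** (1 - 7)  # subnormals and signed zero
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 7)


def decode_e4m3fnuz(bits: int) -> float:
    """Decode one E4M3FNUZ byte: same bit layout, bias 8, no negative zero."""
    if bits == 0x80:
        return math.nan  # the sign-bit-only pattern is the single NaN encoding
    sign = -1.0 if bits & 0x80 else 1.0
    exp = (bits >> 3) & 0xF
    man = bits & 0x7
    if exp == 0:
        return sign * (man / 8.0) * 2.0 ** (1 - 8)
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 8)


def remap_negative_zero(bits: int) -> int:
    """Hypothetical fix-up (assumption): send E4M3FN -0.0 to E4M3FNUZ +0.0."""
    return 0x00 if bits == 0x80 else bits


neg_zero = decode_e4m3fn(0x80)           # -0.0 in E4M3FN
same_bits_fnuz = decode_e4m3fnuz(0x80)   # NaN in E4M3FNUZ
fixed = decode_e4m3fnuz(remap_negative_zero(0x80))  # 0.0 after the remap
```

Copying the byte unchanged would turn every negative zero in the input into NaN, which is why the pattern needs explicit handling.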
Args:
- input_buffer (NDBuffer): Input tensor in E4M3FN format.
- output_buffer (NDBuffer): Output tensor to store E4M3FNUZ format.
- context (DeviceContext): Device context for kernel execution.