Mojo function

convert_e4m3fn_to_e4m3fnuz

convert_e4m3fn_to_e4m3fnuz(input_buffer: TileTensor[DType.float8_e4m3fn, input_buffer.LayoutType, input_buffer.origin, address_space=input_buffer.address_space, linear_idx_type=input_buffer.linear_idx_type, element_size=input_buffer.element_size], output_buffer: TileTensor[DType.float8_e4m3fnuz, output_buffer.LayoutType, output_buffer.origin, address_space=output_buffer.address_space, linear_idx_type=output_buffer.linear_idx_type, element_size=output_buffer.element_size], context: DeviceContext)

Convert E4M3FN weights to E4M3FNUZ format for AMD GPU compatibility.

This conversion handles the key differences between E4M3FN and E4M3FNUZ:

  1. The bit pattern 10000000 (0x80, or -128 when read as a signed 8-bit integer) represents negative zero in E4M3FN but NaN in E4M3FNUZ
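The difference in how the two formats read the byte 10000000 can be checked with a small host-side decoder. The sketch below is plain Python, not the Mojo kernel. The helper names `decode_e4m3fn`, `decode_e4m3fnuz`, and `remap_negative_zero` are hypothetical and exist only for this illustration. The remap of 0x80 to 0x00 is an assumption about how such a conversion might treat that byte, not a description of this function's actual implementation.

```python
import math

def decode_e4m3fn(byte: int) -> float:
    """Decode one E4M3FN byte: 1 sign, 4 exponent, 3 mantissa bits, bias 7."""
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0xF
    man = byte & 0x7
    if exp == 0xF and man == 0x7:
        return math.nan  # 0x7F and 0xFF are the only NaN encodings; no infinities
    if exp == 0:
        return sign * (man / 8) * 2.0 ** (1 - 7)  # subnormals and signed zero
    return sign * (1 + man / 8) * 2.0 ** (exp - 7)

def decode_e4m3fnuz(byte: int) -> float:
    """Decode one E4M3FNUZ byte: same bit layout, bias 8, 0x80 is the only NaN."""
    if byte == 0x80:
        return math.nan  # the would-be negative zero encodes NaN instead
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0xF
    man = byte & 0x7
    if exp == 0:
        return sign * (man / 8) * 2.0 ** (1 - 8)
    return sign * (1 + man / 8) * 2.0 ** (exp - 8)

def remap_negative_zero(byte: int) -> int:
    """Hypothetical fix-up (an assumption): map 0x80 to 0x00 so that an
    E4M3FN negative zero does not turn into NaN when read as E4M3FNUZ."""
    return 0x00 if byte == 0x80 else byte

if __name__ == "__main__":
    print(decode_e4m3fn(0x80))                         # negative zero
    print(decode_e4m3fnuz(0x80))                       # NaN
    print(decode_e4m3fnuz(remap_negative_zero(0x80)))  # positive zero
```

Because the decoders differ only in exponent bias and in the special encodings, the same sketch can be used to compare how any other byte reads under the two formats.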

Args:

  • input_buffer (TileTensor): Input tensor in E4M3FN format.
  • output_buffer (TileTensor): Output tensor that receives the E4M3FNUZ data.
  • context (DeviceContext): Device context for kernel execution.
