Mojo function

convert_e4m3fn_to_e4m3fnuz

convert_e4m3fn_to_e4m3fnuz(input_buffer: LayoutTensor[DType.float8_e4m3fn, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], output_buffer: LayoutTensor[DType.float8_e4m3fnuz, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], context: DeviceContext)

Convert E4M3FN weights to E4M3FNUZ format for AMD GPU compatibility.

This conversion handles the key differences between E4M3FN and E4M3FNUZ:

  1. The bit pattern 10000000 (0x80, which is -128 when read as a signed 8-bit integer) represents negative zero in E4M3FN but NaN in E4M3FNUZ.
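
The difference described above can be illustrated at the bit level. The following is a minimal Python sketch, not the Mojo kernel itself: the helper names `decode_e4m3fn`, `decode_e4m3fnuz`, and `remap_negative_zero` are hypothetical, and the remapping of 0x80 to 0x00 is an assumption about how such a conversion would typically avoid producing NaN. The decoders follow the published format definitions (exponent bias 7 for E4M3FN, bias 8 for E4M3FNUZ).

```python
import math


def decode_e4m3fn(bits: int) -> float:
    """Decode one E4M3FN byte (1 sign, 4 exponent, 3 mantissa bits, bias 7)."""
    sign = -1.0 if bits & 0x80 else 1.0
    exp = (bits >> 3) & 0xF
    man = bits & 0x7
    if exp == 0xF and man == 0x7:
        return math.nan  # the only NaN encodings are 0x7F and 0xFF
    if exp == 0:
        return sign * (man / 8.0) * 2.0 ** -6  # subnormals and signed zero
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 7)


def decode_e4m3fnuz(bits: int) -> float:
    """Decode one E4M3FNUZ byte (bias 8, no negative zero, 0x80 is NaN)."""
    if bits == 0x80:
        return math.nan
    sign = -1.0 if bits & 0x80 else 1.0
    exp = (bits >> 3) & 0xF
    man = bits & 0x7
    if exp == 0:
        return sign * (man / 8.0) * 2.0 ** -7  # subnormals and the single zero
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 8)


def remap_negative_zero(raw: bytes) -> bytes:
    """Hypothetical helper: replace E4M3FN negative zero (0x80) with 0x00.

    Assumption: every other byte is passed through unchanged.
    """
    return bytes(0x00 if b == 0x80 else b for b in raw)


weights = bytes([0x80, 0x38, 0x00])
converted = remap_negative_zero(weights)
print([decode_e4m3fn(b) for b in weights])
print([decode_e4m3fnuz(b) for b in converted])
```

Because the exponent bias differs by one, an unchanged bit pattern decodes to half its E4M3FN value when read as E4M3FNUZ (0x38 is 1.0 in E4M3FN and 0.5 in E4M3FNUZ). The source does not say whether this function compensates for that, so any scale adjustment is assumed to happen elsewhere.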

Args:

  • input_buffer (LayoutTensor): Input tensor in E4M3FN format.
  • output_buffer (LayoutTensor): Output tensor to store E4M3FNUZ format.
  • context (DeviceContext): Device context for kernel execution.