Mojo function
convert_e4m3fn_to_e4m3fnuz
convert_e4m3fn_to_e4m3fnuz(input_buffer: TileTensor[DType.float8_e4m3fn, input_buffer.LayoutType, input_buffer.origin, address_space=input_buffer.address_space, linear_idx_type=input_buffer.linear_idx_type, element_size=input_buffer.element_size], output_buffer: TileTensor[DType.float8_e4m3fnuz, output_buffer.LayoutType, output_buffer.origin, address_space=output_buffer.address_space, linear_idx_type=output_buffer.linear_idx_type, element_size=output_buffer.element_size], context: DeviceContext)
Convert E4M3FN weights to E4M3FNUZ format for AMD GPU compatibility.
This conversion handles the key differences between E4M3FN and E4M3FNUZ:
- The bit pattern 10000000 (0x80, or -128 when read as a signed 8-bit integer) represents negative zero in E4M3FN but NaN in E4M3FNUZ
Args:
- input_buffer (TileTensor): Input tensor in E4M3FN format.
- output_buffer (TileTensor): Output tensor to store E4M3FNUZ format.
- context (DeviceContext): Device context for kernel execution.
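To make the bit-pattern difference described above concrete, here is a minimal Python sketch that works on raw 8-bit patterns rather than on a `TileTensor`. It is not the Mojo kernel and does not call any Mojo API: the names `decode_e4m3fn`, `decode_e4m3fnuz`, and `remap_e4m3fn_bits` are hypothetical helpers introduced only for illustration. The remap shown (send 0x80 to 0x00, keep every other pattern) is an assumption about what a conversion of this kind has to do for the case listed above, not a description of the actual implementation.

```python
import math


def decode_e4m3fn(bits: int) -> float:
    """Decode an E4M3FN byte: exponent bias 7, no infinities, NaN = S.1111.111."""
    sign = -1.0 if bits & 0x80 else 1.0
    exp = (bits >> 3) & 0xF
    man = bits & 0x7
    if exp == 0xF and man == 0x7:
        return math.nan
    if exp == 0:
        # Subnormal range (also covers +0.0 and -0.0).
        return sign * (man / 8.0) * 2.0 ** -6
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 7)


def decode_e4m3fnuz(bits: int) -> float:
    """Decode an E4M3FNUZ byte: exponent bias 8, single zero, NaN = 0x80."""
    if bits == 0x80:
        return math.nan
    sign = -1.0 if bits & 0x80 else 1.0
    exp = (bits >> 3) & 0xF
    man = bits & 0x7
    if exp == 0:
        return sign * (man / 8.0) * 2.0 ** -7
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 8)


def remap_e4m3fn_bits(bits: int) -> int:
    """Hypothetical per-element remap (assumption, not the Mojo kernel):
    E4M3FN negative zero (0x80) would read as NaN in E4M3FNUZ, so map it
    to positive zero (0x00) and leave all other patterns unchanged."""
    return 0x00 if bits == 0x80 else bits


if __name__ == "__main__":
    for b in (0x00, 0x80, 0x38, 0x7E):
        out = remap_e4m3fn_bits(b)
        print(f"{b:#04x}: fn={decode_e4m3fn(b)!r} -> "
              f"{out:#04x}: fnuz={decode_e4m3fnuz(out)!r}")
```

Because E4M3FNUZ uses an exponent bias of 8 where E4M3FN uses 7, an unchanged bit pattern decodes to half its E4M3FN value. This page does not say whether the function compensates for that factor or leaves it to the caller (for example through a separate scale), so the sketch makes no claim either way.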