Mojo function
cvt_block_fp8_to_bf16_with_scale
cvt_block_fp8_to_bf16_with_scale[
    input_type: DType,
    output_dtype: DType,
    KRopeType: MHAOperand,
    //,
    swizzle_fp8: Swizzle,
    swizzle_bf16: Swizzle
](
    input: TileTensor[input_type, input.LayoutType, input.origin, address_space=AddressSpace.SHARED, linear_idx_type=input.linear_idx_type, element_size=input.element_size],
    mut output: TileTensor[output_dtype, output.LayoutType, output.origin, address_space=AddressSpace.SHARED, linear_idx_type=output.linear_idx_type, element_size=output.element_size],
    k_rope_lut: KRopeType,
    seq_info: SeqInfo,
    kv_start_row: UInt32,
    num_keys: UInt32,
    tid: UInt32
)
TileTensor overload: standalone implementation using .ptr and comptime static_shape/static_stride directly.