Mojo function
quantize_dynamic_scaled_fp8
quantize_dynamic_scaled_fp8[
    out_dtype: DType,
    in_dtype: DType,
    scales_dtype: DType,
    //,
    input_fn: fn[width: Int](row: Int, col: Int) capturing -> SIMD[in_dtype, width],
    group_size_or_per_token: Int,
    num_cols: Int
](
    scaled_output: NDBuffer[out_dtype, 2, MutAnyOrigin],
    scales: NDBuffer[scales_dtype, 2, MutAnyOrigin],
    scale_ub: Float32,
    ctx: DeviceContext,
    num_rows: Int
)
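The signature alone does not spell out the arithmetic, so the following is only a rough reference sketch of what dynamic scaled FP8 quantization commonly means, written in plain Python rather than Mojo. Everything in it is an assumption, not the documented behavior of this kernel: the helper name `quantize_dynamic_scaled_reference` is hypothetical, the output format is assumed to be FP8 E4M3 (largest finite value 448), each group's scale is assumed to be its maximum magnitude capped by `scale_ub` and divided by 448, and the layout of the returned scales (one list per row) is chosen for readability only. Rounding to actual FP8 values is omitted.

```python
FP8_E4M3_MAX = 448.0  # assumed output format: largest finite float8 e4m3 value


def quantize_dynamic_scaled_reference(rows, group_size, scale_ub):
    """Hypothetical CPU reference; returns (scaled_rows, scales).

    rows       -- list of equal-length lists of floats (num_rows x num_cols)
    group_size -- number of consecutive columns that share one scale
    scale_ub   -- upper bound applied to each group's max magnitude
    """
    scaled_rows, scales = [], []
    for row in rows:
        out_row, row_scales = [], []
        for start in range(0, len(row), group_size):
            group = row[start:start + group_size]
            # Dynamic scale: largest magnitude in the group, capped by scale_ub.
            max_abs = min(max(abs(v) for v in group), scale_ub)
            # Guard against an all-zero group to avoid dividing by zero.
            scale = max_abs / FP8_E4M3_MAX if max_abs > 0 else 1.0
            row_scales.append(scale)
            # Scale and clamp to the assumed representable range.
            out_row.extend(
                max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, v / scale)) for v in group
            )
        scaled_rows.append(out_row)
        scales.append(row_scales)
    return scaled_rows, scales
```

Under these assumptions, passing a `group_size` equal to the row length would correspond to one scale per row (per token), while a smaller `group_size` gives one scale per block of columns. Values whose magnitude exceeds `scale_ub` saturate at the clamp limit instead of enlarging the scale.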