Mojo struct
MXFP4TokenFormat
struct MXFP4TokenFormat[fp4_dtype: DType, scales_dtype: DType, output_layout: TensorLayout, scales_layout: TensorLayout, //, _hid_dim: Int, _top_k: Int, _alignment: Int = 0]
Fields
- output_tokens (MXFP4TokenFormat[_hid_dim, _top_k, _alignment].TensorType)
- output_scales (MXFP4TokenFormat[_hid_dim, _top_k, _alignment].ScalesTensorType)
Implemented traits
AnyType,
Copyable,
DevicePassable,
ImplicitlyCopyable,
ImplicitlyDestructible,
Movable,
RegisterPassable,
TokenFormat,
TrivialRegisterPassable
comptime members
alignment
comptime alignment = _alignment if _alignment.__bool__() else get_device_alignment()
device_type
comptime device_type = MXFP4TokenFormat[_hid_dim, _top_k, _alignment]
dispatch_smem_size
comptime dispatch_smem_size = 0
dispatch_wait_tile_shape
comptime dispatch_wait_tile_shape = Tuple(128, 1)
group_size
comptime group_size = MXFP4_SF_VECTOR_SIZE
hid_dim
comptime hid_dim = _hid_dim
ScalesTensorType
comptime ScalesTensorType = TileTensor[scales_dtype, scales_layout, MutExternalOrigin]
TensorType
comptime TensorType = TileTensor[fp4_dtype, output_layout, MutExternalOrigin]
top_k
comptime top_k = _top_k
Methods
__init__
__init__(output_tokens: TileTensor[fp4_dtype, output_layout, address_space=output_tokens.address_space, linear_idx_type=output_tokens.linear_idx_type, element_size=output_tokens.element_size], output_scales: TileTensor[scales_dtype, scales_layout, address_space=output_scales.address_space, linear_idx_type=output_scales.linear_idx_type, element_size=output_scales.element_size]) -> Self
get_type_name
fp4_quant_size
scales_size
token_size
scales_offset
copy_token_to_send_buf
static copy_token_to_send_buf[src_type: DType, block_size: Int, buf_addr_space: AddressSpace = AddressSpace.GENERIC](buf_p: UnsafePointer[UInt8, address_space=buf_addr_space], src_p: UnsafePointer[Scalar[src_type], address_space=src_p.address_space], input_scale: Float32)
copy_msg_to_output_tensor
copy_msg_to_output_tensor[buf_addr_space: AddressSpace = AddressSpace.GENERIC](self, buf_p: UnsafePointer[UInt8, address_space=buf_addr_space], token_index: Int)
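
This page lists fp4_quant_size, scales_size, scales_offset, and token_size without signatures or descriptions. The Python sketch below illustrates how such per-token byte sizes could be derived for an MXFP4 payload. It is not the Mojo implementation. It assumes a group size of 32 (standing in for MXFP4_SF_VECTOR_SIZE), two 4-bit values packed per byte, one 1-byte scale per group, scales stored directly after the packed values, and a total rounded up to the alignment. The names TokenSizes, align_up, and mxfp4_token_sizes are hypothetical.

```python
from dataclasses import dataclass

# Assumed MXFP4 parameters; none of these values are stated on this page.
ASSUMED_GROUP_SIZE = 32  # elements sharing one scale (stand-in for MXFP4_SF_VECTOR_SIZE)
FP4_BITS = 4             # bits per quantized element
SCALE_BYTES = 1          # bytes per group scale


@dataclass(frozen=True)
class TokenSizes:
    """Hypothetical container mirroring the method names listed above."""
    fp4_quant_size: int  # bytes of packed 4-bit values
    scales_size: int     # bytes of per-group scales
    scales_offset: int   # byte offset of the scales within one token message
    token_size: int      # total bytes per token, rounded up to the alignment


def align_up(n: int, alignment: int) -> int:
    """Round n up to the next multiple of alignment."""
    return (n + alignment - 1) // alignment * alignment


def mxfp4_token_sizes(hid_dim: int, alignment: int) -> TokenSizes:
    """Compute per-token byte sizes under the assumptions stated above."""
    if hid_dim % ASSUMED_GROUP_SIZE != 0:
        raise ValueError("hid_dim must be a multiple of the group size")
    fp4_bytes = hid_dim * FP4_BITS // 8
    scale_bytes = hid_dim // ASSUMED_GROUP_SIZE * SCALE_BYTES
    return TokenSizes(
        fp4_quant_size=fp4_bytes,
        scales_size=scale_bytes,
        scales_offset=fp4_bytes,  # assumption: scales follow the packed values
        token_size=align_up(fp4_bytes + scale_bytes, alignment),
    )


sizes = mxfp4_token_sizes(hid_dim=7168, alignment=128)
print(sizes)
```

With hid_dim = 7168 and a 128-byte alignment, the sketch gives 3584 bytes of packed values and 224 bytes of scales, padded to 3840 bytes per token. The real struct may lay out or pad the buffer differently.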