Mojo struct
FragmentCoords
@register_passable(trivial)
struct FragmentCoords[stageN: Int, repeats: Int]
Compute coordinates for fragment elements in tensor memory layout.
Based on tcgen05 matrix fragment layout (16x256b): https://docs.nvidia.com/cuda/parallel-thread-execution/#tcgen05-matrix-fragments-shape-16256b
Template Parameters:
- stageN (Int): Stage width in elements.
- repeats (Int): Number of repetitions for wider loads.
Fields
- top_upper (StaticTuple[UInt32, 2]):
- bottom_upper (StaticTuple[UInt32, 2]):
- top_lower (StaticTuple[UInt32, 2]):
- bottom_lower (StaticTuple[UInt32, 2]):
Implemented traits
AnyType,
Copyable,
ImplicitlyCopyable,
Movable,
UnknownDestructibility
comptime members
__copyinit__is_trivial
comptime __copyinit__is_trivial = True
__del__is_trivial
comptime __del__is_trivial = True
__moveinit__is_trivial
comptime __moveinit__is_trivial = True
load_width
comptime load_width = 2
threads_per_row
comptime threads_per_row = ((stageN // repeats) // 2)
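As a rough illustration of how threads_per_row follows from the two struct parameters, the sketch below repeats the same integer arithmetic in a standalone function. The helper name and the sample values of stageN and repeats are assumptions chosen for illustration; they do not come from the library.

```mojo
# Hypothetical helper that mirrors the documented expression
# ((stageN // repeats) // 2); it is not part of the library.
fn threads_per_row(stageN: Int, repeats: Int) -> Int:
    # The divisor 2 matches the documented load_width value.
    return (stageN // repeats) // 2


def main():
    # Sample parameter values are assumed for illustration only.
    print(threads_per_row(32, 1))  # (32 // 1) // 2 = 16
    print(threads_per_row(64, 4))  # (64 // 4) // 2 = 8
```

In the struct itself the value is a comptime member, so it is fixed by the stageN and repeats parameters at compile time rather than computed at run time as in this sketch.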
Methods
__init__
__init__(lane_id: UInt32) -> Self
Initialize fragment coordinates based on lane ID.
Args:
- lane_id (UInt32): Lane ID within the warp.