Mojo module
id
This module provides GPU thread and block indexing functionality.
It defines aliases and functions for accessing GPU grid, block, thread and cluster dimensions and indices. These are essential primitives for GPU programming that allow code to determine its position and dimensions within the GPU execution hierarchy.
Most functionality is architecture-agnostic; features that are NVIDIA-specific are marked as such. The module is meant to work across GPU architectures and uses hardware-specific optimizations where they apply.
comptime values
block_dim
comptime block_dim = _BlockDim()
Contains the dimensions of the block as x, y, and z values.
For example: block_dim.y.
block_dim_int
comptime block_dim_int = _BlockDim()
Contains the dimensions of the block as x, y, and z values.
For example: block_dim_int.y.
block_dim_uint
comptime block_dim_uint = _BlockDim()
Contains the dimensions of the block as x, y, and z values.
For example: block_dim_uint.y.
block_id_in_cluster
comptime block_id_in_cluster = _ClusterBlockIdx()
Contains the index of the thread block within its cluster, as x, y, and z values.
block_idx
comptime block_idx = _BlockIdx()
Contains the block index in the grid, as x, y, and z values.
block_idx_int
comptime block_idx_int = _BlockIdx()
Contains the block index in the grid, as x, y, and z values.
block_idx_uint
comptime block_idx_uint = _BlockIdx()
Contains the block index in the grid, as x, y, and z values.
cluster_dim
comptime cluster_dim = _ClusterDim()
Contains the dimensions of the cluster, as x, y, and z values.
cluster_idx
comptime cluster_idx = _ClusterIdx()
Contains the cluster index in the grid, as x, y, and z values.
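To make the relationship between the cluster accessors concrete, here is a small host-side Python model of one axis. It is not Mojo and does not call this module. It assumes the grid is tiled evenly into clusters, so that a block's grid index can be rebuilt from cluster_idx, cluster_dim, and block_id_in_cluster; the function name is hypothetical.

```python
def block_index_from_cluster(
    cluster_idx: int, cluster_dim: int, block_id_in_cluster: int
) -> int:
    """Model one axis of the block index in the grid.

    Assumption: every cluster holds `cluster_dim` blocks along this axis
    and clusters tile the grid without gaps.
    """
    if not 0 <= block_id_in_cluster < cluster_dim:
        raise ValueError("block_id_in_cluster must be in [0, cluster_dim)")
    return cluster_idx * cluster_dim + block_id_in_cluster


# Cluster 3 along x, 4 blocks per cluster, second block in the cluster.
print(block_index_from_cluster(3, 4, 1))  # prints 13
```

Under this assumption, the value corresponds to what block_idx would report on the same axis.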
global_idx
comptime global_idx = _GlobalIdx()
Contains the global index of the thread within the kernel launch grid, as x, y, and z values.
global_idx_int
comptime global_idx_int = _GlobalIdx()
Contains the global index of the thread within the kernel launch grid, as x, y, and z values.
global_idx_uint
comptime global_idx_uint = _GlobalIdx()
Contains the global index of the thread within the kernel launch grid, as x, y, and z values.
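The global index is conventionally the block index scaled by the block dimension plus the thread index, computed per axis. The Python sketch below models that arithmetic on the host. It is an illustration only, not Mojo and not this module's implementation; Dim3 and global_index are hypothetical names.

```python
from typing import NamedTuple


class Dim3(NamedTuple):
    """Hypothetical stand-in for an x, y, z triple."""
    x: int
    y: int
    z: int


def global_index(block_idx: Dim3, block_dim: Dim3, thread_idx: Dim3) -> Dim3:
    """Per-axis global index: block_idx * block_dim + thread_idx."""
    return Dim3(
        block_idx.x * block_dim.x + thread_idx.x,
        block_idx.y * block_dim.y + thread_idx.y,
        block_idx.z * block_dim.z + thread_idx.z,
    )


# Thread 5 of block 2 in a launch with 128 threads per block along x.
idx = global_index(Dim3(2, 0, 0), Dim3(128, 1, 1), Dim3(5, 0, 0))
print(idx.x)  # prints 261
```

In a kernel, this is the value typically used to pick the array element a thread works on, usually guarded by a bounds check against the array length.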
grid_dim
comptime grid_dim = _GridDim()
Provides accessors for getting the x, y, and z dimensions of a grid.
grid_dim_int
comptime grid_dim_int = _GridDim()
Provides accessors for getting the x, y, and z dimensions of a grid.
grid_dim_uint
comptime grid_dim_uint = _GridDim()
Provides accessors for getting the x, y, and z dimensions of a grid.
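A common use of the grid dimensions is the grid-stride loop, where each thread starts at its global index and advances by the total number of launched threads (grid_dim times block_dim) until it runs past the data. The sketch below models that pattern on the host in Python for one axis; it is not Mojo code, and grid_stride_indices is a hypothetical helper.

```python
def grid_stride_indices(
    global_x: int, grid_dim_x: int, block_dim_x: int, n: int
) -> list[int]:
    """Indices one thread would visit in a grid-stride loop over n elements."""
    # Total threads launched along x; this is the loop stride.
    stride = grid_dim_x * block_dim_x
    return list(range(global_x, n, stride))


# 2 blocks of 4 threads (8 threads in total) processing 20 elements.
print(grid_stride_indices(3, 2, 4, 20))  # prints [3, 11, 19]
```

With this scheme, a fixed launch size covers an input of any length, and every element is visited by exactly one thread.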
thread_idx
comptime thread_idx = _ThreadIdx()
Contains the thread index in the block, as x, y, and z values.
Note: This accessor is in the process of migrating from UInt to Int values.
To continue using UInt thread index values, you may import the UInt-returning alias:
from std.gpu import thread_idx_uint as thread_idx
To migrate to Int, instead import thread_idx_int and update uses to reflect the change to Int:
from std.gpu import thread_idx_int as thread_idx
This thread_idx accessor will change to yielding Int values in a future nightly.
thread_idx_int
comptime thread_idx_int = _ThreadIdx()
Contains the thread index in the block, as x, y, and z values.
thread_idx_uint
comptime thread_idx_uint = _ThreadIdx()
Contains the thread index in the block, as x, y, and z values.
Functions
- lane_id: Returns the lane ID of the current thread within its warp.
- lane_id_int: Returns the lane ID of the current thread within its warp.
- lane_id_uint: Returns the lane ID of the current thread within its warp.
- sm_id: Returns the Streaming Multiprocessor (SM) ID of the current thread.
- warp_id: Returns the warp ID of the current thread within its block. The warp ID is a unique identifier for each warp within a block, ranging from 0 to BLOCK_SIZE/WARP_SIZE-1. This ID is commonly used for warp-level programming and synchronization within a block.
- warp_id_int: Returns the warp ID of the current thread within its block.
- warp_id_uint: Returns the warp ID of the current thread within its block.
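The warp ID range stated above (0 to BLOCK_SIZE/WARP_SIZE-1) follows from dividing a thread's linear index in the block by the warp size; the remainder is the lane ID. The Python sketch below models this on the host. It assumes a one-dimensional block and a warp size of 32, which is typical for NVIDIA GPUs but differs on some other hardware; warp_and_lane is a hypothetical name, not a function in this module.

```python
WARP_SIZE = 32  # Assumption: typical NVIDIA warp size; other GPUs may differ.


def warp_and_lane(thread_x: int, warp_size: int = WARP_SIZE) -> tuple[int, int]:
    """Return (warp_id, lane_id) for a thread in a one-dimensional block."""
    return divmod(thread_x, warp_size)


# Thread 70 of a block falls in warp 2, lane 6.
print(warp_and_lane(70))  # prints (2, 6)

# A 256-thread block has 256 // 32 = 8 warps, so the last warp ID is 7.
print(warp_and_lane(255)[0])  # prints 7
```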