Mojo struct
Attribute
@register_passable("trivial")
struct Attribute
Represents GPU kernel function attributes.
This struct defines constants for various function attributes that can be queried or set for GPU kernels. These attributes provide information about resource requirements and execution constraints of kernel functions.
Aliases
MAX_THREADS_PER_BLOCK = Attribute(__init__[__mlir_type.!pop.int_literal](0))
: The maximum number of threads per block, beyond which a launch of the function would fail. This number depends on both the function and the device on which the function is currently loaded.
SHARED_SIZE_BYTES = Attribute(__init__[__mlir_type.!pop.int_literal](1))
: The size in bytes of statically-allocated shared memory required by this function. This does not include dynamically-allocated shared memory requested by the user at runtime.
CONST_SIZE_BYTES = Attribute(__init__[__mlir_type.!pop.int_literal](2))
: The size in bytes of user-allocated constant memory required by this function.
LOCAL_SIZE_BYTES = Attribute(__init__[__mlir_type.!pop.int_literal](3))
: The size in bytes of local memory used by each thread of this function.
NUM_REGS = Attribute(__init__[__mlir_type.!pop.int_literal](4))
: The number of registers used by each thread of this function.
PTX_VERSION = Attribute(__init__[__mlir_type.!pop.int_literal](5))
: The PTX virtual architecture version for which the function was compiled. This value is the major PTX version * 10 + the minor PTX version, so a PTX version 1.3 function would return the value 13. Note that this may return the undefined value of 0 for cubins compiled prior to CUDA 3.0.
BINARY_VERSION = Attribute(__init__[__mlir_type.!pop.int_literal](6))
: The binary architecture version for which the function was compiled. This value is the major binary version * 10 + the minor binary version, so a binary version 1.3 function would return the value 13. Note that this will return a value of 10 for legacy cubins that do not have a properly-encoded binary architecture version.
CACHE_MODE_CA = Attribute(__init__[__mlir_type.!pop.int_literal](7))
: Indicates whether the function was compiled with the user-specified option "-Xptxas --dlcm=ca".
MAX_DYNAMIC_SHARED_SIZE_BYTES = Attribute(__init__[__mlir_type.!pop.int_literal](8))
: The maximum size in bytes of dynamically-allocated shared memory that can be used by this function. If the user-specified dynamic shared memory size is larger than this value, the launch will fail.
PREFERRED_SHARED_MEMORY_CARVEOUT = Attribute(__init__[__mlir_type.!pop.int_literal](9))
: On devices where the L1 cache and shared memory use the same hardware resources, this sets the shared memory carveout preference, in percent of the total shared memory.
CLUSTER_SIZE_MUST_BE_SET = Attribute(__init__[__mlir_type.!pop.int_literal](10))
: If this attribute is set, the kernel must launch with a valid cluster size specified.
REQUIRED_CLUSTER_WIDTH = Attribute(__init__[__mlir_type.!pop.int_literal](11))
: The required cluster width in blocks. The required width, height, and depth must either all be 0 or all be positive. The validity of the cluster dimensions is otherwise checked at launch time.
REQUIRED_CLUSTER_HEIGHT = Attribute(__init__[__mlir_type.!pop.int_literal](12))
: The required cluster height in blocks. The required width, height, and depth must either all be 0 or all be positive. The validity of the cluster dimensions is otherwise checked at launch time.
REQUIRED_CLUSTER_DEPTH = Attribute(__init__[__mlir_type.!pop.int_literal](13))
: The required cluster depth in blocks. The required width, height, and depth must either all be 0 or all be positive. The validity of the cluster dimensions is otherwise checked at launch time.
NON_PORTABLE_CLUSTER_SIZE_ALLOWED = Attribute(__init__[__mlir_type.!pop.int_literal](14))
: Whether the function can be launched with a non-portable cluster size. 1 is allowed, 0 is disallowed. A non-portable cluster size may only function on the specific SKUs the program is tested on. The launch might fail if the program is run on a different hardware platform. The CUDA API provides cudaOccupancyMaxActiveClusters to assist with checking whether the desired size can be launched on the current device. A portable cluster size is guaranteed to be functional on all compute capabilities higher than the target compute capability. The portable cluster size for sm_90 is 8 blocks per cluster.
CLUSTER_SCHEDULING_POLICY_PREFERENCE = Attribute(__init__[__mlir_type.!pop.int_literal](15))
: The block scheduling policy of a function. The value type is CUclusterSchedulingPolicy / cudaClusterSchedulingPolicy.
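Two of the value conventions above are easy to get wrong when reading queried results: the `major * 10 + minor` encoding used by PTX_VERSION and BINARY_VERSION, and the all-zero-or-all-positive rule for the required cluster dimensions. The following is a minimal Python sketch of both rules, written only to illustrate the arithmetic. The helper names `decode_version` and `cluster_dims_valid` are hypothetical and are not part of the Mojo API, and the sketch does not query a device.

```python
def decode_version(encoded: int) -> tuple[int, int]:
    """Split a PTX_VERSION or BINARY_VERSION value into (major, minor).

    Hypothetical helper: assumes the documented encoding major * 10 + minor.
    A result of (0, 0) corresponds to the undefined value 0 mentioned for
    PTX_VERSION.
    """
    if encoded < 0:
        raise ValueError("version value must be non-negative")
    return divmod(encoded, 10)


def cluster_dims_valid(width: int, height: int, depth: int) -> bool:
    """Check the documented rule: dimensions are all 0 or all positive.

    Hypothetical helper: this only covers the static rule. Any further
    validation happens at launch time and is not modeled here.
    """
    dims = (width, height, depth)
    return all(d == 0 for d in dims) or all(d > 0 for d in dims)


if __name__ == "__main__":
    print(decode_version(13))           # (1, 3), i.e. version 1.3
    print(cluster_dims_valid(2, 1, 1))  # True
    print(cluster_dims_valid(2, 0, 1))  # False: mixes zero and positive
```

A value of 10 from BINARY_VERSION decodes to (1, 0), which per the description above may simply indicate a legacy cubin without a properly-encoded version.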
Fields
- code (SIMD[int32, 1]): The numeric code representing the attribute type.
Implemented traits
AnyType, Copyable, ExplicitlyCopyable, Movable, UnknownDestructibility, Writable
Methods
__eq__
__eq__(self, other: Self) -> Bool
Checks if two Attribute instances are equal.
Args:
- other (Self): The Attribute to compare with.
Returns:
True if both attributes have the same code, False otherwise.
__ne__
__ne__(self, other: Self) -> Bool
Checks if two Attribute instances are not equal.
Args:
- other (Self): The Attribute to compare with.
Returns:
True if the attributes have different codes, False otherwise.
__is__
__is__(self, other: Self) -> Bool
Identity comparison operator for Attribute instances.
Args:
- other (Self): The Attribute to compare with.
Returns:
True if both attributes are identical (have the same code), False otherwise.
__isnot__
__isnot__(self, other: Self) -> Bool
Negative identity comparison operator for Attribute instances.
Args:
- other (Self): The Attribute to compare with.
Returns:
True if the attributes are not identical (have different codes), False otherwise.
write_to
write_to[W: Writer](self, mut writer: W)
Writes a string representation of the Attribute to the provided writer.
This method converts the `Attribute` enum value to its corresponding string name and writes it to the provided writer object.
Parameters:
- W (Writer): The type of writer to use for output. Must implement the Writer trait.
Args:
- writer (W): A Writer object that will receive the string representation.
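The behavior described on this page (comparison by numeric code, and writing the attribute's name to a writer) can be modeled in a few lines. The sketch below is a Python model only, not the Mojo implementation. `AttributeModel` and `_NAMES` are hypothetical names, the codes and names are copied from the Aliases list, and the assumption that `write_to` emits exactly the alias name is not confirmed by this page.

```python
import io
from dataclasses import dataclass

# Alias names indexed by their numeric code, copied from the Aliases list.
_NAMES = (
    "MAX_THREADS_PER_BLOCK",
    "SHARED_SIZE_BYTES",
    "CONST_SIZE_BYTES",
    "LOCAL_SIZE_BYTES",
    "NUM_REGS",
    "PTX_VERSION",
    "BINARY_VERSION",
    "CACHE_MODE_CA",
    "MAX_DYNAMIC_SHARED_SIZE_BYTES",
    "PREFERRED_SHARED_MEMORY_CARVEOUT",
    "CLUSTER_SIZE_MUST_BE_SET",
    "REQUIRED_CLUSTER_WIDTH",
    "REQUIRED_CLUSTER_HEIGHT",
    "REQUIRED_CLUSTER_DEPTH",
    "NON_PORTABLE_CLUSTER_SIZE_ALLOWED",
    "CLUSTER_SCHEDULING_POLICY_PREFERENCE",
)


@dataclass(frozen=True)
class AttributeModel:
    """Hypothetical Python stand-in for the Mojo Attribute struct."""

    code: int  # the dataclass-generated __eq__ compares this field

    def write_to(self, writer) -> None:
        # Assumption: the written text is the alias name for this code.
        if not 0 <= self.code < len(_NAMES):
            raise ValueError(f"unknown attribute code: {self.code}")
        writer.write(_NAMES[self.code])


if __name__ == "__main__":
    buf = io.StringIO()
    AttributeModel(4).write_to(buf)
    print(buf.getvalue())                          # NUM_REGS
    print(AttributeModel(1) == AttributeModel(1))  # True: same code
    print(AttributeModel(1) != AttributeModel(2))  # True: different codes
```

In this model, equality and identity collapse into the same check because an attribute carries no state other than its code, which matches how `__eq__` and `__is__` are described above.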