Mojo module
intrinsics
This module includes NVIDIA GPU intrinsic operations.
Structs
Functions
- byte_permute: Return selected bytes from two 32-bit unsigned integers.
- ldg: Load a register variable from global state space via non-coherent cache.
- load_acquire
- load_volatile
- lop: Performs an arbitrary logical operation on 3 inputs.
- mulhi: Calculate the most significant 32 bits of the product of two UInt16 values.
- mulwide: Calculate the most significant 32 bits of the product of two UInt values.
- store_release
- store_volatile
- threadfence: Memory fence that can be used to enforce an ordering on memory accesses.
- warpgroup_reg_alloc: Provides a hint to the system to increase the maximum number of per-thread registers owned by the executing warp to the value specified by the imm-reg-count operand.
- warpgroup_reg_dealloc: Provides a hint to the system to decrease the maximum number of per-thread registers owned by the executing warp to the value specified by the imm-reg-count operand.
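The one-line summaries for byte_permute and lop leave the bit-level behavior implicit, so a small host-side reference model may help. The sketch below is plain Python, not Mojo, and does not call this module. It assumes that byte_permute follows the semantics of CUDA's `__byte_perm` (3-bit selector fields, one per output byte) and that lop follows PTX `lop3.b32` (an 8-bit lookup table indexed by the three input bits). Both assumptions, and the names `byte_permute_ref` and `lop3_ref`, are illustrative and not taken from this page.

```python
MASK32 = 0xFFFFFFFF


def byte_permute_ref(a, b, selector):
    """Reference model, assuming CUDA __byte_perm semantics (hypothetical helper)."""
    # Eight candidate bytes: indices 0-3 come from a, indices 4-7 from b.
    pool = ((b & MASK32) << 32) | (a & MASK32)
    result = 0
    for i in range(4):
        # The low 3 bits of each selector nibble pick the source byte
        # for output byte i.
        index = (selector >> (4 * i)) & 0x7
        byte = (pool >> (8 * index)) & 0xFF
        result |= byte << (8 * i)
    return result


def lop3_ref(a, b, c, lut):
    """Reference model, assuming PTX lop3.b32 semantics (hypothetical helper)."""
    result = 0
    for bit in range(32):
        # Build a 3-bit index from the inputs: a is the high bit, c the low bit.
        idx = (((a >> bit) & 1) << 2) | (((b >> bit) & 1) << 1) | ((c >> bit) & 1)
        # The lookup table supplies the output bit for that input combination.
        result |= ((lut >> idx) & 1) << bit
    return result


if __name__ == "__main__":
    # Identity selector keeps the bytes of the first operand.
    print(hex(byte_permute_ref(0x33221100, 0x77665544, 0x3210)))  # 0x33221100
    # Reversing the selector reverses the byte order of the first operand.
    print(hex(byte_permute_ref(0x33221100, 0x77665544, 0x0123)))  # 0x112233
    # lut 0x80 encodes a & b & c; lut 0x96 encodes a ^ b ^ c.
    print(hex(lop3_ref(0xF0F0F0F0, 0xCCCCCCCC, 0xAAAAAAAA, 0x80)))  # 0x80808080
    print(hex(lop3_ref(0xF0F0F0F0, 0xCCCCCCCC, 0xAAAAAAAA, 0x96)))  # 0x96969696
```

Under the lop3 convention assumed here, a lookup table for any three-input function can be derived by applying that function to the constants 0xF0, 0xCC, and 0xAA. For example, `0xF0 & 0xCC & 0xAA` gives 0x80 and `0xF0 ^ 0xCC ^ 0xAA` gives 0x96.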