Mojo module

intrinsics

This module includes intrinsic operations for NVIDIA GPUs.

Structs

Functions

  • byte_permute: Return selected bytes from two 32-bit unsigned integers.
  • ldg: Load a register variable from global state space via non-coherent cache.
  • load_acquire:
  • load_volatile:
  • lop: Performs arbitrary logical operation on 3 inputs.
  • mulhi: Calculate the most significant 32 bits of the product of the two UInt16s.
  • mulwide: Calculate the full, double-width product of the two 32-bit integers.
  • store_release:
  • store_volatile:
  • threadfence: Memory fence functions can be used to enforce some ordering on memory accesses.
  • warpgroup_reg_alloc: Provides a hint to the system to increase the maximum number of per-thread registers owned by the executing warp to the value specified by the imm-reg-count operand.
  • warpgroup_reg_dealloc: Provides a hint to the system to decrease the maximum number of per-thread registers owned by the executing warp to the value specified by the imm-reg-count operand.
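
The one-line summaries for byte_permute, lop, mulhi, and mulwide are terse, so a small host-side model may help. The sketch below is plain Python, not Mojo, and it does not call this module. It assumes these intrinsics follow the corresponding CUDA/PTX operations (`__byte_perm`, `lop3`, `mul.hi`, `mul.wide`) for 32-bit unsigned inputs. All function names in it are hypothetical, chosen for illustration only.

```python
# Reference model of a few GPU integer intrinsics (assumed CUDA/PTX semantics).
# These helpers are illustrative and are NOT the Mojo API.

MASK32 = 0xFFFFFFFF


def lop3_model(a, b, c, lut):
    """Apply an arbitrary 3-input bitwise operation described by an 8-bit LUT.

    For every bit position, the three input bits form an index (a is the
    most significant bit of the index), and the LUT bit at that index is
    the output bit.
    """
    result = 0
    for bit in range(32):
        idx = (((a >> bit) & 1) << 2) | (((b >> bit) & 1) << 1) | ((c >> bit) & 1)
        result |= ((lut >> idx) & 1) << bit
    return result & MASK32


def lut_for(op):
    """Build the 8-bit LUT for op by evaluating it on the standard constants."""
    return op(0xF0, 0xCC, 0xAA) & 0xFF


def mulhi_u32_model(a, b):
    """Most significant 32 bits of the 64-bit product of two 32-bit values."""
    return ((a & MASK32) * (b & MASK32)) >> 32


def mulwide_u32_model(a, b):
    """Full 64-bit product of two 32-bit values."""
    return (a & MASK32) * (b & MASK32)


def byte_permute_model(x, y, selector):
    """Pick four bytes out of the eight bytes of (y:x).

    Each selector nibble chooses one source byte using its low 3 bits:
    0-3 index the bytes of x, 4-7 index the bytes of y.
    """
    pool = ((y & MASK32) << 32) | (x & MASK32)
    out = 0
    for i in range(4):
        sel = (selector >> (4 * i)) & 0x7
        out |= ((pool >> (8 * sel)) & 0xFF) << (8 * i)
    return out


if __name__ == "__main__":
    lut = lut_for(lambda a, b, c: (a & b) | c)
    print(hex(lut))
    print(hex(lop3_model(0x0F0F0F0F, 0x00FF00FF, 0x10000000, lut)))
    print(hex(mulhi_u32_model(0xFFFFFFFF, 0xFFFFFFFF)))
    print(hex(byte_permute_model(0x33221100, 0x77665544, 0x0123)))
```

Under these assumptions, a selector of 0x3210 passed to the byte-permute model returns x unchanged and 0x7654 returns y, which is a quick way to sanity-check a selector before using it in a kernel.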