Mojo function

multimem_st

multimem_st[type: DType, *, count: Int, scope: Scope, consistency: Consistency, width: Int = 1](addr: UnsafePointer[SIMD[type, 1], address_space=AddressSpace(1)], values: StaticTuple[SIMD[type, width], count])

Stages an inline multimem.st instruction.

This operation performs a store to all memory locations pointed to by the multimem address using the specified memory consistency model and scope.

Notes:

  • Requires SM90+ GPU architecture (PTX ISA 8.1+).
  • The address must be a valid multimem address.
  • Supported type-width combinations must total 32/64/128 bits.
  • Default memory semantics: weak consistency (when not specified).
  • Vector stores (.v2/.v4) require matching total size constraints.

Example:

from gpu.memory import *
from utils import StaticTuple

# `addr`, `val1`/`val2`, and `vec1`..`vec4` are assumed to be defined by the caller.

# Store 2 float32 values to the multimem address.
multimem_st[DType.float32, count=2, scope=Scope.CTA, consistency=Consistency.RELAXED](
    addr, StaticTuple[SIMD[DType.float32, 1], 2](val1, val2)
)

# Vector store of 4 float16x2 values.
multimem_st[DType.float16, count=4, scope=Scope.CLUSTER, consistency=Consistency.RELEASE, width=2](
    addr, StaticTuple[SIMD[DType.float16, 2], 4](vec1, vec2, vec3, vec4)
)

See Also: PTX ISA Documentation.

Parameters:

  • type (DType): The data type of elements to store (must be float16, bfloat16, or float32).
  • count (Int): Number of vector elements per store operation (2 or 4).
  • scope (Scope): Memory scope for visibility of the store operation (CTA/Cluster/GPU/System).
  • consistency (Consistency): Memory consistency semantics (weak/relaxed/release).
  • width (Int): Vector width modifier for packed data types (default 1).

Args:

  • addr (UnsafePointer[SIMD[type, 1], address_space=AddressSpace(1)]): Multimem address in global address space pointing to multiple locations.
  • values (StaticTuple[SIMD[type, width], count]): Packed SIMD values to store; the tuple length must equal the count parameter.
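
The size rules in the Notes and Parameters sections can be checked by hand before choosing type, count, and width. The following is a minimal sketch in plain Python, for illustration only; it is not part of the Mojo API. The helper names (`total_store_bits`, `is_supported_store`) are hypothetical, and the formula "element bits × width × count" is an assumed reading of the note that supported combinations must total 32, 64, or 128 bits.

```python
# Illustrative only: plain Python, not Mojo, and not part of gpu.memory.
# Bit widths of the element types listed as supported under Parameters.
ELEMENT_BITS = {"float16": 16, "bfloat16": 16, "float32": 32}
ALLOWED_COUNTS = (2, 4)              # from the `count` parameter description
ALLOWED_TOTAL_BITS = (32, 64, 128)   # from the Notes section


def total_store_bits(dtype: str, count: int, width: int = 1) -> int:
    """Total bits written by one store (assumed: element bits * width * count)."""
    return ELEMENT_BITS[dtype] * width * count


def is_supported_store(dtype: str, count: int, width: int = 1) -> bool:
    """Return True if the combination satisfies the documented constraints."""
    if dtype not in ELEMENT_BITS or count not in ALLOWED_COUNTS or width < 1:
        return False
    return total_store_bits(dtype, count, width) in ALLOWED_TOTAL_BITS


if __name__ == "__main__":
    # The two combinations used in the Example section.
    print(total_store_bits("float32", count=2))           # 32 * 1 * 2
    print(total_store_bits("float16", count=4, width=2))  # 16 * 2 * 4
    # A combination that exceeds 128 bits under the assumed formula.
    print(is_supported_store("float32", count=4, width=2))
```

Under this reading, both calls in the Example section pass (64 and 128 bits), while float32 with count=4 and width=2 would be rejected because it totals 256 bits.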