Mojo module

simd

Implements SIMD primitives and abstractions.

Provides high-performance SIMD primitives and abstractions for vectorized computation in Mojo. It enables efficient data-parallel operations by leveraging hardware vector processing units across different architectures.

Key Features:

  1. Architecture-agnostic SIMD abstractions with automatic hardware detection
  2. Optimized vector operations for common numerical computations
  3. Explicit control over vectorization strategies and memory layouts
  4. Zero-cost abstractions that compile to efficient machine code
  5. Support for different vector widths and element types

Primary Components:

  • Vector types: Strongly-typed vector containers with element-wise operations
  • SIMD intrinsics: Low-level access to hardware SIMD instructions
  • Vectorized algorithms: Common algorithms optimized for SIMD execution
  • Memory utilities: Aligned memory allocation and vector load/store operations

Performance Considerations:

  • Vector width selection should match target hardware capabilities
  • Memory alignment affects load/store performance
  • Data layout transformations may be necessary for optimal vectorization

Integration: This module is designed to work seamlessly with other Mojo numerical computing components, including tensor operations, linear algebra routines, and domain-specific libraries for machine learning and scientific computing.

Aliases

  • BFloat16 = SIMD[bfloat16, 1]: Represents a 16-bit brain floating point value.

  • Byte = SIMD[uint8, 1]: Represents a byte (backed by an 8-bit unsigned integer).

  • Float16 = SIMD[float16, 1]: Represents a 16-bit floating point value.

  • Float32 = SIMD[float32, 1]: Represents a 32-bit floating point value.

  • Float64 = SIMD[float64, 1]: Represents a 64-bit floating point value.

  • Float8_e4m3fn = SIMD[float8_e4m3fn, 1]: Represents the E4M3 floating point format defined in the OFP8 standard. This type is named differently across libraries and vendors, for example:

    • Mojo, PyTorch, JAX, and LLVM refer to it as e4m3fn.
    • OCP, NVIDIA CUDA, and AMD ROCm refer to it as e4m3.

    All of these names refer to the same finite type specified in the OFP8 standard, encoded as seeeemmm:

    • (s)ign: 1 bit
    • (e)xponent: 4 bits
    • (m)antissa: 3 bits
    • exponent bias: 7
    • nan: 01111111, 11111111
    • -0: 10000000
    • fn: finite (no inf or -inf encodings)

  • Float8_e4m3fnuz = SIMD[float8_e4m3fnuz, 1]: Represents an 8-bit e4m3fnuz floating point format, encoded as seeeemmm:

    • (s)ign: 1 bit
    • (e)xponent: 4 bits
    • (m)antissa: 3 bits
    • exponent bias: 8
    • nan: 10000000
    • fn: finite (no inf or -inf encodings)
    • uz: unsigned zero (no -0 encoding)

  • Float8_e5m2 = SIMD[float8_e5m2, 1]: Represents the 8-bit E5M2 floating point format from the OFP8 standard, encoded as seeeeemm:

    • (s)ign: 1 bit
    • (e)xponent: 5 bits
    • (m)antissa: 2 bits
    • exponent bias: 15
    • nan: {0,1}11111{01,10,11}
    • inf: 01111100
    • -inf: 11111100
    • -0: 10000000

  • Float8_e5m2fnuz = SIMD[float8_e5m2fnuz, 1]: Represents an 8-bit e5m2fnuz floating point format, encoded as seeeeemm:

    • (s)ign: 1 bit
    • (e)xponent: 5 bits
    • (m)antissa: 2 bits
    • exponent bias: 16
    • nan: 10000000
    • fn: finite (no inf or -inf encodings)
    • uz: unsigned zero (no -0 encoding)

  • Int128 = SIMD[si128, 1]: Represents a 128-bit signed scalar integer.

  • Int16 = SIMD[int16, 1]: Represents a 16-bit signed scalar integer.

  • Int256 = SIMD[si256, 1]: Represents a 256-bit signed scalar integer.

  • Int32 = SIMD[int32, 1]: Represents a 32-bit signed scalar integer.

  • Int64 = SIMD[int64, 1]: Represents a 64-bit signed scalar integer.

  • Int8 = SIMD[int8, 1]: Represents an 8-bit signed scalar integer.

  • Scalar = SIMD[?, 1]: Represents a scalar of the given dtype, that is, a SIMD value with exactly one element.

  • UInt128 = SIMD[ui128, 1]: Represents a 128-bit unsigned scalar integer.

  • UInt16 = SIMD[uint16, 1]: Represents a 16-bit unsigned scalar integer.

  • UInt256 = SIMD[ui256, 1]: Represents a 256-bit unsigned scalar integer.

  • UInt32 = SIMD[uint32, 1]: Represents a 32-bit unsigned scalar integer.

  • UInt64 = SIMD[uint64, 1]: Represents a 64-bit unsigned scalar integer.

  • UInt8 = SIMD[uint8, 1]: Represents an 8-bit unsigned scalar integer.
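
To make the Float8 bit layouts listed above concrete, here is a minimal Python sketch (not Mojo, and not part of this module's API) that decodes a raw byte into its numeric value for each of the four formats. The names `Float8Format` and `decode_float8` are hypothetical helpers introduced only for illustration. The sketch also assumes the usual IEEE-style convention that an all-zero exponent field marks a subnormal value with no implicit leading 1, which this page does not state explicitly.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Float8Format:
    """Hypothetical description of an 8-bit float layout (illustration only)."""
    exp_bits: int  # width of the exponent field
    man_bits: int  # width of the mantissa field
    bias: int      # exponent bias
    flavor: str    # "ieee" (inf and nan), "fn" (nan only), "fnuz" (nan only, no -0)


# Parameters taken from the alias descriptions above.
E4M3FN = Float8Format(4, 3, 7, "fn")
E4M3FNUZ = Float8Format(4, 3, 8, "fnuz")
E5M2 = Float8Format(5, 2, 15, "ieee")
E5M2FNUZ = Float8Format(5, 2, 16, "fnuz")


def decode_float8(byte: int, fmt: Float8Format) -> float:
    """Decode one raw byte (0..255) into a Python float."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in the range 0..255")

    exp_max = (1 << fmt.exp_bits) - 1
    man_max = (1 << fmt.man_bits) - 1
    sign = byte >> 7
    exp = (byte >> fmt.man_bits) & exp_max
    man = byte & man_max

    # Special encodings differ per flavor.
    if fmt.flavor == "fnuz":
        # The would-be "negative zero" pattern is the single nan encoding.
        if byte == 0x80:
            return math.nan
    elif fmt.flavor == "fn":
        # Only the all-ones exponent with all-ones mantissa is nan; no inf.
        if exp == exp_max and man == man_max:
            return math.nan
    else:
        # IEEE-like: all-ones exponent is inf (mantissa 0) or nan (otherwise).
        if exp == exp_max:
            if man != 0:
                return math.nan
            return -math.inf if sign else math.inf

    if exp == 0:
        # Subnormal (assumed convention): no implicit leading 1.
        magnitude = man * 2.0 ** (1 - fmt.bias - fmt.man_bits)
    else:
        # Normal: implicit leading 1 plus the fractional mantissa.
        magnitude = (1 + man / (1 << fmt.man_bits)) * 2.0 ** (exp - fmt.bias)
    return -magnitude if sign else magnitude
```

Under these assumptions, the byte 00111000 decodes to 1.0 in e4m3fn, while the same value 1.0 is encoded as 01000000 in e4m3fnuz because the bias is 8 rather than 7. The byte 10000000 decodes to -0.0 in e4m3fn and e5m2 but to nan in both fnuz formats.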

Structs

  • SIMD: Represents a small vector that is backed by a hardware vector element.