Mojo module
simd
Implements SIMD primitives and abstractions.
Provides high-performance SIMD primitives and abstractions for vectorized computation in Mojo. It enables efficient data-parallel operations by leveraging hardware vector processing units across different architectures.
Key Features:
- Architecture-agnostic SIMD abstractions with automatic hardware detection
- Optimized vector operations for common numerical computations
- Explicit control over vectorization strategies and memory layouts
- Zero-cost abstractions that compile to efficient machine code
- Support for different vector widths and element types
Primary Components:
- Vector types: Strongly-typed vector containers with element-wise operations
- SIMD intrinsics: Low-level access to hardware SIMD instructions
- Vectorized algorithms: Common algorithms optimized for SIMD execution
- Memory utilities: Aligned memory allocation and vector load/store operations
Performance Considerations:
- Vector width selection should match target hardware capabilities
- Memory alignment affects load/store performance
- Data layout transformations may be necessary for optimal vectorization
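The first consideration is simple arithmetic: the lane count is the register width divided by the element width. The Python sketch below is illustrative only and is not part of the Mojo API. `lanes_for` and `split_for_vectorization` are hypothetical helper names, and the register widths are common examples rather than values queried from hardware. It shows that calculation and how a loop over n elements divides into full vector steps plus a scalar remainder.

```python
def lanes_for(register_bits: int, element_bits: int) -> int:
    """Number of elements of the given width that fit in one vector register."""
    if register_bits % element_bits != 0:
        raise ValueError("element width must divide register width")
    return register_bits // element_bits


def split_for_vectorization(n: int, lanes: int) -> tuple:
    """Return (full vector iterations, leftover scalar elements) for n items."""
    return divmod(n, lanes)


if __name__ == "__main__":
    # Example register widths (assumed, not detected): 128, 256, and 512 bits.
    for register_bits in (128, 256, 512):
        lanes = lanes_for(register_bits, 32)  # 32-bit elements
        print(register_bits, lanes, split_for_vectorization(1000, lanes))
```

With 32-bit elements and a 512-bit register, 1000 elements split into 62 full vector steps and 8 leftover elements that need scalar or masked handling.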
Integration: This module is designed to work seamlessly with other Mojo numerical computing components, including tensor operations, linear algebra routines, and domain-specific libraries for machine learning and scientific computing.
Aliases
BFloat16
comptime BFloat16 = BFloat16
Represents a 16-bit brain floating point value.
Byte
comptime Byte = UInt8
Represents a byte (backed by an 8-bit unsigned integer).
Float16
comptime Float16 = Float16
Represents a 16-bit floating point value.
Float32
comptime Float32 = Float32
Represents a 32-bit floating point value.
Float4_e2m1fn
comptime Float4_e2m1fn = Float4_e2m1fn
Represents a 4-bit e2m1 floating point format, encoded as s.ee.m and defined by the Open Compute MX Format Specification:
- (s)ign: 1 bit
- (e)xponent: 2 bits
- (m)antissa: 1 bit
- exponent_bias: 1
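To make the s.ee.m layout concrete, here is a minimal Python sketch of a decoder for the 4-bit pattern. It is not Mojo code, and `decode_e2m1` is a hypothetical name. It assumes, as the `fn` suffix suggests, that every bit pattern is finite, and that an all-zero exponent field denotes a subnormal value.

```python
def decode_e2m1(bits: int) -> float:
    """Decode a 4-bit e2m1 pattern (s.ee.m, exponent bias 1) to a Python float."""
    if not 0 <= bits <= 0b1111:
        raise ValueError("expected a 4-bit value")
    sign = -1.0 if bits & 0b1000 else 1.0
    exponent = (bits >> 1) & 0b11
    mantissa = bits & 0b1
    if exponent == 0:
        # Subnormal: no implicit leading 1, effective exponent is 1 - bias = 0.
        return sign * (mantissa / 2.0)
    # Normal: implicit leading 1, exponent field offset by the bias of 1.
    return sign * (1.0 + mantissa / 2.0) * 2.0 ** (exponent - 1)


if __name__ == "__main__":
    # The eight non-negative patterns, 0b0000 through 0b0111.
    print([decode_e2m1(b) for b in range(8)])
    # prints [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
```

The eight negative patterns mirror these values, so the format covers only 15 distinct magnitudes and signs plus a negative zero.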
Float64
comptime Float64 = Float64
Represents a 64-bit floating point value.
Float8_e4m3fn
comptime Float8_e4m3fn = Float8_e4m3fn
Represents the E4M3 floating point format defined in the OFP8 standard.
This type is named differently across libraries and vendors, for example:
- Mojo, PyTorch, JAX, and LLVM refer to it as e4m3fn.
- OCP, NVIDIA CUDA, and AMD ROCm refer to it as e4m3.
In these contexts, they are all referring to the same finite type specified
in the OFP8 standard above, encoded as seeeemmm:
- (s)ign: 1 bit
- (e)xponent: 4 bits
- (m)antissa: 3 bits
- exponent bias: 7
- nan: 01111111, 11111111
- -0: 10000000
- fn: finite (no inf or -inf encodings)
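The encoding rules above can be sketched as a small Python decoder. This is an illustration of the bit layout, not the Mojo implementation, and `decode_e4m3fn` is a hypothetical name. It assumes subnormals when the exponent field is zero. Note that only the all-ones mantissa with an all-ones exponent is NaN, so the remaining all-ones-exponent patterns are ordinary finite values.

```python
import math


def decode_e4m3fn(byte: int) -> float:
    """Decode an 8-bit e4m3fn pattern (seeeemmm, exponent bias 7)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected an 8-bit value")
    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> 3) & 0xF
    mantissa = byte & 0x7
    if exponent == 0xF and mantissa == 0x7:
        return math.nan  # 01111111 and 11111111 are the only NaN encodings
    if exponent == 0:
        # Subnormal: no implicit leading 1, effective exponent is 1 - 7.
        return sign * (mantissa / 8.0) * 2.0 ** (1 - 7)
    # Normal, including exponent 1111 with mantissa 000..110 (no infinities).
    return sign * (1.0 + mantissa / 8.0) * 2.0 ** (exponent - 7)


if __name__ == "__main__":
    print(decode_e4m3fn(0x38))  # 0 0111 000
    print(decode_e4m3fn(0x7E))  # 0 1111 110, the largest finite value
    print(decode_e4m3fn(0x7F))  # NaN
```

Under these rules the largest finite value is 448.0 (pattern 01111110) and the smallest positive subnormal is 2^-9.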
Float8_e4m3fnuz
comptime Float8_e4m3fnuz = Float8_e4m3fnuz
Represents an 8-bit e4m3fnuz floating point format, encoded as seeeemmm:
- (s)ign: 1 bit
- (e)xponent: 4 bits
- (m)antissa: 3 bits
- exponent bias: 8
- nan: 10000000
- fn: finite (no inf or -inf encodings)
- uz: unsigned zero (no -0 encoding)
Float8_e5m2
comptime Float8_e5m2 = Float8_e5m2
Represents the 8-bit E5M2 floating point format from the OFP8 standard, encoded as seeeeemm:
- (s)ign: 1 bit
- (e)xponent: 5 bits
- (m)antissa: 2 bits
- exponent bias: 15
- nan: {0,1}11111{01,10,11}
- inf: 01111100
- -inf: 11111100
- -0: 10000000
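Unlike the finite-only formats, E5M2 reserves the all-ones exponent for infinities and NaNs. A minimal Python sketch of that decoding follows. It is an illustration only, `decode_e5m2` is a hypothetical name, and subnormal handling for a zero exponent field is assumed.

```python
import math


def decode_e5m2(byte: int) -> float:
    """Decode an 8-bit E5M2 pattern (seeeeemm, exponent bias 15)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected an 8-bit value")
    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> 2) & 0x1F
    mantissa = byte & 0x3
    if exponent == 0x1F:
        # All-ones exponent: mantissa 00 is +/-inf, mantissa 01/10/11 is NaN.
        return sign * math.inf if mantissa == 0 else math.nan
    if exponent == 0:
        # Subnormal: no implicit leading 1, effective exponent is 1 - 15.
        return sign * (mantissa / 4.0) * 2.0 ** (1 - 15)
    return sign * (1.0 + mantissa / 4.0) * 2.0 ** (exponent - 15)


if __name__ == "__main__":
    print(decode_e5m2(0x3C))  # 0 01111 00
    print(decode_e5m2(0x7B))  # 0 11110 11, the largest finite value
    print(decode_e5m2(0x7C))  # +inf
    print(decode_e5m2(0xFC))  # -inf
```

Under these rules the largest finite value is 57344.0 (pattern 01111011).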
Float8_e5m2fnuz
comptime Float8_e5m2fnuz = Float8_e5m2fnuz
Represents an 8-bit floating point format, encoded as seeeeemm:
- (s)ign: 1 bit
- (e)xponent: 5 bits
- (m)antissa: 2 bits
- exponent bias: 16
- nan: 10000000
- fn: finite (no inf or -inf encodings)
- uz: unsigned zero (no -0 encoding)
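The two fnuz formats on this page share the same rules and differ only in field widths and bias, so one parametrized decoder covers both. The Python sketch below is illustrative and is not the Mojo implementation. `decode_fnuz` and its parameters are hypothetical names, and subnormal handling for a zero exponent field is assumed.

```python
import math


def decode_fnuz(byte: int, exp_bits: int, man_bits: int, bias: int) -> float:
    """Decode an 8-bit fnuz pattern: one sign bit, then exponent, then mantissa."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected an 8-bit value")
    if exp_bits + man_bits != 7:
        raise ValueError("exponent and mantissa must fill the 7 non-sign bits")
    if byte == 0x80:
        # The pattern that would be -0 elsewhere is the single NaN encoding.
        return math.nan
    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> man_bits) & ((1 << exp_bits) - 1)
    fraction = (byte & ((1 << man_bits) - 1)) / (1 << man_bits)
    if exponent == 0:
        # Subnormal: no implicit leading 1, effective exponent is 1 - bias.
        return sign * fraction * 2.0 ** (1 - bias)
    # Every other pattern is a finite normal value (no inf encodings).
    return sign * (1.0 + fraction) * 2.0 ** (exponent - bias)


if __name__ == "__main__":
    print(decode_fnuz(0x7F, exp_bits=4, man_bits=3, bias=8))   # e4m3fnuz maximum
    print(decode_fnuz(0x7F, exp_bits=5, man_bits=2, bias=16))  # e5m2fnuz maximum
```

Under these rules the largest finite values are 240.0 for e4m3fnuz and 57344.0 for e5m2fnuz, both at pattern 01111111.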
Int128
comptime Int128 = Int128
Represents a 128-bit signed scalar integer.
Int16
comptime Int16 = Int16
Represents a 16-bit signed scalar integer.
Int256
comptime Int256 = Int256
Represents a 256-bit signed scalar integer.
Int32
comptime Int32 = Int32
Represents a 32-bit signed scalar integer.
Int64
comptime Int64 = Int64
Represents a 64-bit signed scalar integer.
Int8
comptime Int8 = Int8
Represents an 8-bit signed scalar integer.
Scalar
comptime Scalar = Scalar[?]
Represents a scalar dtype.
U8x16
comptime U8x16 = SIMD[DType.uint8, 16]
UInt128
comptime UInt128 = UInt128
Represents a 128-bit unsigned scalar integer.
UInt16
comptime UInt16 = UInt16
Represents a 16-bit unsigned scalar integer.
UInt256
comptime UInt256 = UInt256
Represents a 256-bit unsigned scalar integer.
UInt32
comptime UInt32 = UInt32
Represents a 32-bit unsigned scalar integer.
UInt64
comptime UInt64 = UInt64
Represents a 64-bit unsigned scalar integer.
UInt8
comptime UInt8 = UInt8
Represents an 8-bit unsigned scalar integer.
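For the integer aliases above, the representable range follows directly from the bit width. The Python sketch below is not Mojo code, and `int_range` is a hypothetical helper. It computes the bounds, assuming two's-complement encoding for the signed types.

```python
def int_range(bits: int, signed: bool) -> tuple:
    """Return (min, max) for an integer type of the given bit width."""
    if bits <= 0:
        raise ValueError("bit width must be positive")
    if signed:
        # Two's complement (assumed): one more negative value than positive.
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


if __name__ == "__main__":
    for bits in (8, 16, 32, 64, 128, 256):
        print(f"Int{bits}:", int_range(bits, signed=True))
        print(f"UInt{bits}:", int_range(bits, signed=False))
```

For example, an 8-bit signed integer spans -128 to 127, while its unsigned counterpart spans 0 to 255.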
Structs
- FastMathFlag: Flags for controlling fast-math optimizations in floating-point operations.
- SIMD: Represents a vector type that leverages hardware acceleration to process multiple data elements with a single operation.