intrinsics

Module

Defines intrinsics.

You can import these APIs from the complex package. For example:

from sys.intrinsics import PrefetchLocality

PrefetchLocality

The prefetch locality.

The locality, rw, and cache type correspond to LLVM prefetch intrinsic’s inputs (see LLVM prefetch locality)

Aliases:

  • NONE = __init__(0): No locality.
  • LOW = __init__(1): Low locality.
  • MEDIUM = __init__(2): Medium locality.
  • HIGH = __init__(3): Extremely local locality (keep in cache).

Fields:

  • value (SIMD[si32, 1]): The prefetch locality to use. It should be a value in [0, 3].

Functions:

__init__

__init__(value: Int) -> Self

Constructs a prefetch locality option.

Args:

  • value (Int): An integer value representing the locality. Should be a value in the range [0, 3].

Returns:

The prefetch locality constructed.

PrefetchRW

Prefetch read or write.

Aliases:

  • READ = __init__(0): Read prefetch.
  • WRITE = __init__(1): Write prefetch.

Fields:

  • value (SIMD[si32, 1]): The read-write prefetch. It should be in [0, 1].

Functions:

__init__

__init__(value: Int) -> Self

Constructs a prefetch read-write option.

Args:

  • value (Int): An integer value representing the prefetch read-write option to be used. Should be a value in the range [0, 1].

Returns:

The prefetch read-write option constructed.

PrefetchCache

Prefetch cache type.

Aliases:

  • INSTRUCTION = __init__(0): The instruction prefetching option.
  • DATA = __init__(1): The data prefetching option.

Fields:

  • value (SIMD[si32, 1]): The cache prefetch. It should be in [0, 1].

Functions:

__init__

__init__(value: Int) -> Self

Constructs a prefetch option.

Args:

  • value (Int): An integer value representing the prefetch cache option to be used. Should be a value in the range [0, 1].

Returns:

The prefetch cache type that was constructed.

PrefetchOptions

Collection of configuration parameters for a prefetch intrinsic call.

The op configuration follows similar interface as LLVM intrinsic prefetch op, with a “locality” attribute that specifies the level of temporal locality in the application, that is, how soon would the same data be visited again. Possible locality values are: NONE, LOW, MEDIUM, and HIGH.

The op also takes a “cache tag” attribute giving hints on how the prefetched data will be used. Possible tags are: ReadICache, ReadDCache and WriteDCache.

Note: the actual behavior of the prefetch op and concrete interpretation of these attributes are target-dependent.

Fields:

  • rw (PrefetchRW): Indicates prefetching for read or write.
  • locality (PrefetchLocality): Indicates locality level.
  • cache (PrefetchCache): Indicates i-cache or d-cache prefetching.

Functions:

__init__

__init__() -> Self

Constructs an instance of PrefetchOptions with default params.

Returns:

The Prefetch configuration constructed.

for_read

for_read(self: Self) -> Self

Sets the prefetch purpose to read.

Returns:

The updated prefetch parameter.

for_write

for_write(self: Self) -> Self

Sets the prefetch purpose to write.

Returns:

The updated prefetch parameter.

no_locality

no_locality(self: Self) -> Self

Sets the prefetch locality to none.

Returns:

The updated prefetch parameter.

low_locality

low_locality(self: Self) -> Self

Sets the prefetch locality to low.

Returns:

The updated prefetch parameter.

medium_locality

medium_locality(self: Self) -> Self

Sets the prefetch locality to medium.

Returns:

The updated prefetch parameter.

high_locality

high_locality(self: Self) -> Self

Sets the prefetch locality to high.

Returns:

The updated prefetch parameter.

to_data_cache

to_data_cache(self: Self) -> Self

Sets the prefetch target to data cache.

Returns:

The updated prefetch parameter.

to_instruction_cache

to_instruction_cache(self: Self) -> Self

Sets the prefetch target to instruction cache.

Returns:

The updated prefetch parameter.

llvm_intrinsic

llvm_intrinsic[intrin: StringLiteral, type: AnyType]() -> *"type"

Calls an LLVM intrinsic with no arguments.

Calls an LLVM intrinsic with the name intrin and return type type.

Parameters:

  • intrin (StringLiteral): The name of the llvm intrinsic.
  • type (AnyType): The return type of the intrinsic.

Returns:

The result of calling the llvm intrinsic with no arguments.

llvm_intrinsic[intrin: StringLiteral, type: AnyType, T0: AnyType](arg0: T0) -> *"type"

Calls an LLVM intrinsic with one argument.

Calls the intrinsic with the name intrin and return type type on argument arg0.

Parameters:

  • intrin (StringLiteral): The name of the llvm intrinsic.
  • type (AnyType): The return type of the intrinsic.
  • T0 (AnyType): The type of the first argument to the intrinsic (arg0).

Args:

  • arg0 (T0): The argument to call the LLVM intrinsic with. The type of arg0 must be T0.

Returns:

The result of calling the llvm intrinsic with arg0 as an argument.

llvm_intrinsic[intrin: StringLiteral, type: AnyType, T0: AnyType, T1: AnyType](arg0: T0, arg1: T1) -> *"type"

Calls an LLVM intrinsic with two arguments.

Calls the LLVM intrinsic with the name intrin and return type type on arguments arg0 and arg1.

Parameters:

  • intrin (StringLiteral): The name of the llvm intrinsic.
  • type (AnyType): The return type of the intrinsic.
  • T0 (AnyType): The type of the first argument to the intrinsic (arg0).
  • T1 (AnyType): The type of the second argument to the intrinsic (arg1).

Args:

  • arg0 (T0): The first argument to call the LLVM intrinsic with. The type of arg0 must be T0.
  • arg1 (T1): The second argument to call the LLVM intrinsic with. The type of arg1 must be T1.

Returns:

The result of calling the llvm intrinsic with arg0 and arg1 as arguments.

llvm_intrinsic[intrin: StringLiteral, type: AnyType, T0: AnyType, T1: AnyType, T2: AnyType](arg0: T0, arg1: T1, arg2: T2) -> *"type"

Calls an LLVM intrinsic with three arguments.

Calls the LLVM intrinsic with the name intrin and return type type on arguments arg0, arg1 and arg2.

Parameters:

  • intrin (StringLiteral): The name of the llvm intrinsic.
  • type (AnyType): The return type of the intrinsic.
  • T0 (AnyType): The type of the first argument to the intrinsic (arg0).
  • T1 (AnyType): The type of the second argument to the intrinsic (arg1).
  • T2 (AnyType): The type of the third argument to the intrinsic (arg2).

Args:

  • arg0 (T0): The first argument to call the LLVM intrinsic with. The type of arg0 must be T0.
  • arg1 (T1): The second argument to call the LLVM intrinsic with. The type of arg1 must be T1.
  • arg2 (T2): The third argument to call the LLVM intrinsic with. The type of arg2 must be T2.

Returns:

The result of calling the llvm intrinsic with arg0, arg1 and arg2 as arguments.

llvm_intrinsic[intrin: StringLiteral, type: AnyType, T0: AnyType, T1: AnyType, T2: AnyType, T3: AnyType](arg0: T0, arg1: T1, arg2: T2, arg3: T3) -> *"type"

Calls an LLVM intrinsic with four arguments.

Calls the LLVM intrinsic with the name intrin and return type type on arguments arg0, arg1, arg2 and arg3.

Parameters:

  • intrin (StringLiteral): The name of the llvm intrinsic.
  • type (AnyType): The return type of the intrinsic.
  • T0 (AnyType): The type of the first argument to the intrinsic (arg0).
  • T1 (AnyType): The type of the second argument to the intrinsic (arg1).
  • T2 (AnyType): The type of the third argument to the intrinsic (arg2).
  • T3 (AnyType): The type of the fourth argument to the intrinsic (arg3).

Args:

  • arg0 (T0): The first argument to call the LLVM intrinsic with. The type of arg0 must be T0.
  • arg1 (T1): The second argument to call the LLVM intrinsic with. The type of arg1 must be T1.
  • arg2 (T2): The third argument to call the LLVM intrinsic with. The type of arg2 must be T2.
  • arg3 (T3): The fourth argument to call the LLVM intrinsic with. The type of arg3 must be T3.

Returns:

The result of calling the llvm intrinsic with arg0, arg1, arg2 and arg3 as arguments.

llvm_intrinsic[intrin: StringLiteral, type: AnyType, T0: AnyType, T1: AnyType, T2: AnyType, T3: AnyType, T4: AnyType](arg0: T0, arg1: T1, arg2: T2, arg3: T3, arg4: T4) -> *"type"

Calls an LLVM intrinsic with five arguments.

Calls the LLVM intrinsic with the name intrin and return type type on arguments arg0, arg1, arg2, arg3 and arg4.

Parameters:

  • intrin (StringLiteral): The name of the llvm intrinsic.
  • type (AnyType): The return type of the intrinsic.
  • T0 (AnyType): The type of the first argument to the intrinsic (arg0).
  • T1 (AnyType): The type of the second argument to the intrinsic (arg1).
  • T2 (AnyType): The type of the third argument to the intrinsic (arg2).
  • T3 (AnyType): The type of the fourth argument to the intrinsic (arg3).
  • T4 (AnyType): The type of the fifth argument to the intrinsic (arg4).

Args:

  • arg0 (T0): The first argument to call the LLVM intrinsic with. The type of arg0 must be T0.
  • arg1 (T1): The second argument to call the LLVM intrinsic with. The type of arg1 must be T1.
  • arg2 (T2): The third argument to call the LLVM intrinsic with. The type of arg2 must be T2.
  • arg3 (T3): The fourth argument to call the LLVM intrinsic with. The type of arg3 must be T3.
  • arg4 (T4): The fourth argument to call the LLVM intrinsic with. The type of arg4 must be T4.

Returns:

The result of calling the llvm intrinsic with arg0, arg1, arg2, arg3 and arg4 as arguments.

external_call

external_call[callee: StringLiteral, type: AnyType]() -> *"type"

Calls an external function.

Parameters:

  • callee (StringLiteral): The name of the external function.
  • type (AnyType): The return type.

Returns:

The external call result.

external_call[callee: StringLiteral, type: AnyType, T0: AnyType](arg0: T0) -> *"type"

Calls an external function.

Parameters:

  • callee (StringLiteral): The name of the external function.
  • type (AnyType): The return type.
  • T0 (AnyType): The first argument type.

Args:

  • arg0 (T0): The first argument.

Returns:

The external call result.

external_call[callee: StringLiteral, type: AnyType, T0: AnyType, T1: AnyType](arg0: T0, arg1: T1) -> *"type"

Calls an external function.

Parameters:

  • callee (StringLiteral): The name of the external function.
  • type (AnyType): The return type.
  • T0 (AnyType): The first argument type.
  • T1 (AnyType): The second argument type.

Args:

  • arg0 (T0): The first argument.
  • arg1 (T1): The second argument.

Returns:

The external call result.

external_call[callee: StringLiteral, type: AnyType, T0: AnyType, T1: AnyType, T2: AnyType](arg0: T0, arg1: T1, arg2: T2) -> *"type"

Calls an external function.

Parameters:

  • callee (StringLiteral): The name of the external function.
  • type (AnyType): The return type.
  • T0 (AnyType): The first argument type.
  • T1 (AnyType): The second argument type.
  • T2 (AnyType): The third argument type.

Args:

  • arg0 (T0): The first argument.
  • arg1 (T1): The second argument.
  • arg2 (T2): The third argument.

Returns:

The external call result.

external_call[callee: StringLiteral, type: AnyType, T0: AnyType, T1: AnyType, T2: AnyType, T3: AnyType](arg0: T0, arg1: T1, arg2: T2, arg3: T3) -> *"type"

Calls an external function.

Parameters:

  • callee (StringLiteral): The name of the external function.
  • type (AnyType): The return type.
  • T0 (AnyType): The first argument type.
  • T1 (AnyType): The second argument type.
  • T2 (AnyType): The third argument type.
  • T3 (AnyType): The fourth argument type.

Args:

  • arg0 (T0): The first argument.
  • arg1 (T1): The second argument.
  • arg2 (T2): The third argument.
  • arg3 (T3): The fourth argument.

Returns:

The external call result.

external_call[callee: StringLiteral, type: AnyType, T0: AnyType, T1: AnyType, T2: AnyType, T3: AnyType, T4: AnyType](arg0: T0, arg1: T1, arg2: T2, arg3: T3, arg4: T4) -> *"type"

Calls an external function.

Parameters:

  • callee (StringLiteral): The name of the external function.
  • type (AnyType): The return type.
  • T0 (AnyType): The first argument type.
  • T1 (AnyType): The second argument type.
  • T2 (AnyType): The third argument type.
  • T3 (AnyType): The fourth argument type.
  • T4 (AnyType): The fifth argument type.

Args:

  • arg0 (T0): The first argument.
  • arg1 (T1): The second argument.
  • arg2 (T2): The third argument.
  • arg3 (T3): The fourth argument.
  • arg4 (T4): The fifth argument.

Returns:

The external call result.

gather

gather[type: DType, size: Int](base: SIMD[address, size], mask: SIMD[bool, size], passthrough: SIMD[type, size], alignment: Int) -> SIMD[type, size]

Reads scalar values from a SIMD vector, and gathers them into one vector.

The gather function reads scalar values from a SIMD vector of memory locations and gathers them into one vector. The memory locations are provided in the vector of pointers base as addresses. The memory is accessed according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the passthrough operand.

In general, for some vector of pointers base, mask mask, and passthrough pass a call of the form:

gather(base, mask, pass)

is equivalent to the following sequence of scalar loads in C++:

for (int i = 0; i < N; i++)
  result[i] = mask[i] ? *base[i] : passthrough[i];

Parameters:

  • type (DType): DType of the return SIMD buffer.
  • size (Int): Size of the return SIMD buffer.

Args:

  • base (SIMD[address, size]): The vector containing memory addresses that gather will access.
  • mask (SIMD[bool, size]): A binary vector which prevents memory access to certain lanes of the base vector.
  • passthrough (SIMD[type, size]): In the result vector, the masked-off lanes are replaced with the passthrough vector.
  • alignment (Int): The alignment of the source addresses. Must be 0 or a power of two constant integer value.

Returns:

A SIMD[type, size] containing the result of the gather operation.

scatter

scatter[type: DType, size: Int](value: SIMD[type, size], base: SIMD[address, size], mask: SIMD[bool, size], alignment: Int)

Takes scalar values from a SIMD vector and scatters them into a vector of pointers.

The scatter operation stores scalar values from a SIMD vector of memory locations and scatters them into a vector of pointers. The memory locations are provided in the vector of pointers base as addresses. The memory is stored according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.

The value operand is a vector value to be written to memory. The base operand is a vector of pointers, pointing to where the value elements should be stored. It has the same underlying type as the value operand. The mask operand, mask, is a vector of boolean values. The types of the mask and the value operand must have the same number of vector elements.

The behavior of the _scatter is undefined if the op stores into the same memory location more than once.

In general, for some vector %value, vector of pointers %base, and mask %mask instructions of the form:

%0 = pop.simd.scatter %value, %base[%mask] : !pop.simd<N, type>

is equivalent to the following sequence of scalar loads in C++:

for (int i = 0; i < N; i++)
  if (mask[i])
    base[i] = value[i];

Parameters:

  • type (DType): DType of value, the result SIMD buffer.
  • size (Int): Size of value, the result SIMD buffer.

Args:

  • value (SIMD[type, size]): The vector that will contain the result of the scatter operation.
  • base (SIMD[address, size]): The vector containing memory addresses that scatter will access.
  • mask (SIMD[bool, size]): A binary vector which prevents memory access to certain lanes of the base vector.
  • alignment (Int): The alignment of the source addresses. Must be 0 or a power of two constant integer value.

prefetch

prefetch[type: DType, params: PrefetchOptions](addr: DTypePointer[type])

Prefetches an instruction or data into cache before it is used.

The prefetch function provides prefetching hints for the target to prefetch instruction or data into cache before they are used.

Parameters:

  • type (DType): The DType of value stored in addr.
  • params (PrefetchOptions): Configuration options for the prefect intrinsic.

Args:

  • addr (DTypePointer[type]): The data pointer to prefetch.

masked_load

masked_load[type: DType, size: Int](addr: DTypePointer[type], mask: SIMD[bool, size], passthrough: SIMD[type, size], alignment: Int) -> SIMD[type, size]

Loads data from memory and return it, replacing masked lanes with values from the passthrough vector.

Parameters:

  • type (DType): DType of the return SIMD buffer.
  • size (Int): Size of the return SIMD buffer.

Args:

  • addr (DTypePointer[type]): The base pointer for the load.
  • mask (SIMD[bool, size]): A binary vector which prevents memory access to certain lanes of the memory stored at addr.
  • passthrough (SIMD[type, size]): In the result vector, the masked-off lanes are replaced with the passthrough vector.
  • alignment (Int): The alignment of the source addresses. Must be 0 or a power of two constant integer value. Default is 1.

Returns:

The loaded memory stored in a vector of type SIMD[type, size].

masked_store

masked_store[type: DType, size: Int](value: SIMD[type, size], addr: DTypePointer[type], mask: SIMD[bool, size], alignment: Int)

Stores a value at a memory location, skipping masked lanes.

Parameters:

  • type (DType): DType of value, the data to store.
  • size (Int): Size of value, the data to store.

Args:

  • value (SIMD[type, size]): The vector containing data to store.
  • addr (DTypePointer[type]): A vector of memory location to store data at.
  • mask (SIMD[bool, size]): A binary vector which prevents memory access to certain lanes of value.
  • alignment (Int): The alignment of the destination locations. Must be 0 or a power of two constant integer value.

compressed_store

compressed_store[type: DType, size: Int](value: SIMD[type, size], addr: DTypePointer[type], mask: SIMD[bool, size])

Compresses the lanes of value, skipping mask lanes, and stores at addr.

Parameters:

  • type (DType): DType of value, the value to store.
  • size (Int): Size of value, the value to store.

Args:

  • value (SIMD[type, size]): The vector containing data to store.
  • addr (DTypePointer[type]): The memory location to store the compressed data.
  • mask (SIMD[bool, size]): A binary vector which prevents memory access to certain lanes of value.

strided_load

strided_load[type: DType, simd_width: Int](addr: DTypePointer[type], stride: Int, mask: SIMD[bool, simd_width]) -> SIMD[type, simd_width]

Loads values from addr according to a specific stride.

Parameters:

  • type (DType): DType of value, the value to store.
  • simd_width (Int): The width of the SIMD vectors.

Args:

  • addr (DTypePointer[type]): The memory location to load data from.
  • stride (Int): How many lanes to skip before loading again.
  • mask (SIMD[bool, simd_width]): A binary vector which prevents memory access to certain lanes of value.

Returns:

A vector containing the loaded data.

strided_load[type: DType, simd_width: Int](addr: DTypePointer[type], stride: Int) -> SIMD[type, simd_width]

Loads values from addr according to a specific stride.

Parameters:

  • type (DType): DType of value, the value to store.
  • simd_width (Int): The width of the SIMD vectors.

Args:

  • addr (DTypePointer[type]): The memory location to load data from.
  • stride (Int): How many lanes to skip before loading again.

Returns:

A vector containing the loaded data.

strided_store

strided_store[type: DType, simd_width: Int](value: SIMD[type, simd_width], addr: DTypePointer[type], stride: Int, mask: SIMD[bool, simd_width])

Loads values from addr according to a specific stride.

Parameters:

  • type (DType): DType of value, the value to store.
  • simd_width (Int): The width of the SIMD vectors.

Args:

  • value (SIMD[type, simd_width]): The values to store.
  • addr (DTypePointer[type]): The location to store values at.
  • stride (Int): How many lanes to skip before storing again.
  • mask (SIMD[bool, simd_width]): A binary vector which prevents memory access to certain lanes of value.

strided_store[type: DType, simd_width: Int](value: SIMD[type, simd_width], addr: DTypePointer[type], stride: Int)

Loads values from addr according to a specific stride.

Parameters:

  • type (DType): DType of value, the value to store.
  • simd_width (Int): The width of the SIMD vectors.

Args:

  • value (SIMD[type, simd_width]): The values to store.
  • addr (DTypePointer[type]): The location to store values at.
  • stride (Int): How many lanes to skip before storing again.