Intrinsics
Module
Defines intrinsics.
PrefetchCache
Prefetch cache type.
Aliases:
DATA = PrefetchCache(1)
The data prefetching option.
INSTRUCTION = PrefetchCache(0)
The instruction prefetching option.
Fields:
value
The cache prefetch. It should be in [0, 1].
Functions:
__init__
__init__(value: Int) -> Self
Construct a prefetch option.
Args:
- value (Int): An integer value representing the prefetch cache option to be used. Should be a value in the range [0, 1].
Returns:
The prefetch cache type that was constructed.
PrefetchLocality
The prefetch locality.
The locality, rw, and cache type correspond to the inputs of the LLVM prefetch intrinsic (see LLVM prefetch locality).
Aliases:
HIGH = PrefetchLocality(3)
Extremely local locality (keep in cache).
LOW = PrefetchLocality(1)
Low locality.
MEDIUM = PrefetchLocality(2)
Medium locality.
NONE = PrefetchLocality(0)
No locality.
Fields:
value
The prefetch locality to use. It should be a value in [0, 3].
Functions:
__init__
__init__(value: Int) -> Self
Construct a prefetch locality option.
Args:
- value (Int): An integer value representing the locality. Should be a value in the range [0, 3].
Returns:
The prefetch locality constructed.
PrefetchOptions
Collection of configuration parameters for a prefetch intrinsic call.
The op configuration follows an interface similar to the LLVM prefetch intrinsic, with a “locality” attribute that specifies the level of temporal locality in the application, that is, how soon the same data will be visited again. Possible locality values are: NONE, LOW, MEDIUM, and HIGH.
The op also takes a “cache tag” attribute giving hints on how the prefetched data will be used. Possible tags are: ReadICache, ReadDCache and WriteDCache.
Note: the actual behavior of the prefetch op and concrete interpretation of these attributes are target-dependent.
Fields:
cache
Indicates i-cache or d-cache prefetching.
locality
Indicates locality level.
rw
Indicates prefetching for read or write.
Functions:
__init__
__init__() -> Self
Constructs an instance of PrefetchOptions with default params.
Returns:
The Prefetch configuration constructed.
for_read
for_read(self: Self) -> Self
Sets the prefetch purpose to read.
Returns:
The updated prefetch parameter.
for_write
for_write(self: Self) -> Self
Sets the prefetch purpose to write.
Returns:
The updated prefetch parameter.
high_locality
high_locality(self: Self) -> Self
Sets the prefetch locality to high.
Returns:
The updated prefetch parameter.
low_locality
low_locality(self: Self) -> Self
Sets the prefetch locality to low.
Returns:
The updated prefetch parameter.
medium_locality
medium_locality(self: Self) -> Self
Sets the prefetch locality to medium.
Returns:
The updated prefetch parameter.
no_locality
no_locality(self: Self) -> Self
Sets the prefetch locality to none.
Returns:
The updated prefetch parameter.
to_data_cache
to_data_cache(self: Self) -> Self
Sets the prefetch target to data cache.
Returns:
The updated prefetch parameter.
to_instruction_cache
to_instruction_cache(self: Self) -> Self
Sets the prefetch target to instruction cache.
Returns:
The updated prefetch parameter.
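The setters above each return an updated copy, so they can be chained. As a rough illustration of that pattern only, here is a minimal Python sketch. The class name `PrefetchOptionsModel` is hypothetical, and the default values are an assumption, since this reference does not state what "default params" are; the integer encodings come from the aliases documented above.

```python
from dataclasses import dataclass, replace

# Integer encodings taken from the aliases documented above.
READ, WRITE = 0, 1
INSTRUCTION, DATA = 0, 1
NONE, LOW, MEDIUM, HIGH = 0, 1, 2, 3


@dataclass(frozen=True)
class PrefetchOptionsModel:
    """Hypothetical Python stand-in for PrefetchOptions (not the Mojo type)."""

    # Assumed defaults; the reference does not specify them.
    rw: int = READ
    locality: int = HIGH
    cache: int = DATA

    def for_read(self):
        return replace(self, rw=READ)

    def for_write(self):
        return replace(self, rw=WRITE)

    def no_locality(self):
        return replace(self, locality=NONE)

    def low_locality(self):
        return replace(self, locality=LOW)

    def medium_locality(self):
        return replace(self, locality=MEDIUM)

    def high_locality(self):
        return replace(self, locality=HIGH)

    def to_data_cache(self):
        return replace(self, cache=DATA)

    def to_instruction_cache(self):
        return replace(self, cache=INSTRUCTION)


# Each call returns a new value, so the calls compose left to right.
opts = PrefetchOptionsModel().for_write().low_locality().to_data_cache()
```

Because every method returns a fresh value, the original options object is never mutated, which mirrors the `(self: Self) -> Self` signatures listed above.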
PrefetchRW
Prefetch read or write.
Aliases:
READ = PrefetchRW(0)
Read prefetch.
WRITE = PrefetchRW(1)
Write prefetch.
Fields:
value
The read-write prefetch. It should be in [0, 1].
Functions:
__init__
__init__(value: Int) -> Self
Construct a prefetch read-write option.
Args:
- value (Int): An integer value representing the prefetch read-write option to be used. Should be a value in the range [0, 1].
Returns:
The prefetch read-write option constructed.
compressed_store
compressed_store[size: Int, type: DType](value: SIMD[type, size], addr: DTypePointer[type], mask: SIMD[bool, size])
Compress the lanes of value, skipping masked lanes, and store at addr.
Parameters:
- size (Int): Size of value, the value to store.
- type (DType): DType of value, the value to store.
Args:
- value (SIMD[type, size]): The vector containing data to store.
- addr (DTypePointer[type]): The memory location to store the compressed data.
- mask (SIMD[bool, size]): A binary vector which prevents memory access to certain lanes of value.
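To make "compress" concrete, the following is a scalar reference model in Python, not the Mojo API. It assumes that a set mask bit means the lane is kept (as in LLVM's masked compress-store intrinsic), models memory as a plain list, and models addr as an index into it. The function name `compressed_store_model` is hypothetical.

```python
def compressed_store_model(value, memory, addr, mask):
    """Write the lanes of `value` whose mask bit is set to consecutive
    slots of `memory`, starting at index `addr`."""
    out = addr
    for lane, keep in zip(value, mask):
        if keep:
            memory[out] = lane
            out += 1
    return out - addr  # number of elements actually written


cs_memory = [0] * 6
cs_written = compressed_store_model(
    [10, 20, 30, 40], cs_memory, 1, [True, False, True, True]
)
# cs_memory is now [0, 10, 30, 40, 0, 0] and cs_written is 3
```

Note that the kept lanes land next to each other in memory; the skipped lane leaves no gap.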
external_call
external_call[callee: StringLiteral, type: AnyType]() -> type
Call an external function.
Parameters:
- callee (StringLiteral): The name of the external function.
- type (AnyType): The return type.
Returns:
The external call result.
external_call[callee: StringLiteral, type: AnyType, T0: AnyType](arg0: T0) -> type
Call an external function.
Parameters:
- callee (StringLiteral): The name of the external function.
- type (AnyType): The return type.
- T0 (AnyType): The first argument type.
Args:
- arg0 (T0): The first argument.
Returns:
The external call result.
external_call[callee: StringLiteral, type: AnyType, T0: AnyType, T1: AnyType](arg0: T0, arg1: T1) -> type
Call an external function.
Parameters:
- callee (StringLiteral): The name of the external function.
- type (AnyType): The return type.
- T0 (AnyType): The first argument type.
- T1 (AnyType): The second argument type.
Args:
- arg0 (T0): The first argument.
- arg1 (T1): The second argument.
Returns:
The external call result.
external_call[callee: StringLiteral, type: AnyType, T0: AnyType, T1: AnyType, T2: AnyType](arg0: T0, arg1: T1, arg2: T2) -> type
Call an external function.
Parameters:
- callee (StringLiteral): The name of the external function.
- type (AnyType): The return type.
- T0 (AnyType): The first argument type.
- T1 (AnyType): The second argument type.
- T2 (AnyType): The third argument type.
Args:
- arg0 (T0): The first argument.
- arg1 (T1): The second argument.
- arg2 (T2): The third argument.
Returns:
The external call result.
external_call[callee: StringLiteral, type: AnyType, T0: AnyType, T1: AnyType, T2: AnyType, T3: AnyType](arg0: T0, arg1: T1, arg2: T2, arg3: T3) -> type
Call an external function.
Parameters:
- callee (StringLiteral): The name of the external function.
- type (AnyType): The return type.
- T0 (AnyType): The first argument type.
- T1 (AnyType): The second argument type.
- T2 (AnyType): The third argument type.
- T3 (AnyType): The fourth argument type.
Args:
- arg0 (T0): The first argument.
- arg1 (T1): The second argument.
- arg2 (T2): The third argument.
- arg3 (T3): The fourth argument.
Returns:
The external call result.
external_call[callee: StringLiteral, type: AnyType, T0: AnyType, T1: AnyType, T2: AnyType, T3: AnyType, T4: AnyType](arg0: T0, arg1: T1, arg2: T2, arg3: T3, arg4: T4) -> type
Call an external function.
Parameters:
- callee (StringLiteral): The name of the external function.
- type (AnyType): The return type.
- T0 (AnyType): The first argument type.
- T1 (AnyType): The second argument type.
- T2 (AnyType): The third argument type.
- T3 (AnyType): The fourth argument type.
- T4 (AnyType): The fifth argument type.
Args:
- arg0 (T0): The first argument.
- arg1 (T1): The second argument.
- arg2 (T2): The third argument.
- arg3 (T3): The fourth argument.
- arg4 (T4): The fifth argument.
Returns:
The external call result.
gather
gather[size: Int, type: DType](base: SIMD[address, size], mask: SIMD[bool, size], passthrough: SIMD[type, size], alignment: Int) -> SIMD[type, size]
Read scalar values from a SIMD vector, and gather them into one vector.
The gather function reads scalar values from a SIMD vector of memory locations and gathers them into one vector. The memory locations are provided in the vector of pointers base as addresses. The memory is accessed according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the passthrough operand.
In general, for some vector of pointers base, mask mask, and passthrough vector passthrough, a call of the form:
    gather(base, mask, passthrough)
is equivalent to the following sequence of scalar loads in C++:
    for (int i = 0; i < N; i++)
      result[i] = mask[i] ? *base[i] : passthrough[i];
Parameters:
- size (Int): Size of the return SIMD buffer.
- type (DType): DType of the return SIMD buffer.
Args:
- base (SIMD[address, size]): The vector containing memory addresses that gather will access.
- mask (SIMD[bool, size]): A binary vector which prevents memory access to certain lanes of the base vector.
- passthrough (SIMD[type, size]): In the result vector, the masked-off lanes are replaced with the passthrough vector.
- alignment (Int): The alignment of the source addresses. Must be 0 or a power of two constant integer value.
Returns:
A SIMD[type, size] containing the result of the gather operation.
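The gather semantics described above can be checked with a small scalar reference model. This is a Python sketch rather than Mojo: memory is modeled as a list, addresses as indices into it, and the name `gather_model` is hypothetical. The alignment argument is ignored because it does not affect the result.

```python
def gather_model(memory, base, mask, passthrough):
    """Per lane: read memory[base[i]] if mask[i] is set, else take passthrough[i].
    Masked-off addresses are never dereferenced."""
    return [
        memory[addr] if active else fallback
        for addr, active, fallback in zip(base, mask, passthrough)
    ]


g_memory = [100, 101, 102, 103, 104, 105]
# Lane 1 holds an invalid address (999) but is masked off, so it is never read.
g_result = gather_model(
    g_memory, [5, 999, 3, 1], [True, False, True, True], [-1, -2, -3, -4]
)
# g_result is [105, -2, 103, 101]
```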
llvm_intrinsic
llvm_intrinsic[intrin: StringLiteral, type: AnyType]() -> type
Call an LLVM intrinsic with no arguments.
Call an LLVM intrinsic with the name intrin and return type type.
Parameters:
- intrin (StringLiteral): The name of the llvm intrinsic.
- type (AnyType): The return type of the intrinsic.
Returns:
The result of calling the llvm intrinsic with no arguments.
llvm_intrinsic[intrin: StringLiteral, type: AnyType, T0: AnyType](arg0: T0) -> type
Call an LLVM intrinsic with one argument.
Call the intrinsic with the name intrin and return type type on argument arg0.
Parameters:
- intrin (StringLiteral): The name of the llvm intrinsic.
- type (AnyType): The return type of the intrinsic.
- T0 (AnyType): The type of the first argument to the intrinsic (arg0).
Args:
- arg0 (T0): The argument to call the LLVM intrinsic with. The type of arg0 must be T0.
Returns:
The result of calling the llvm intrinsic with arg0 as an argument.
llvm_intrinsic[intrin: StringLiteral, type: AnyType, T0: AnyType, T1: AnyType](arg0: T0, arg1: T1) -> type
Call an LLVM intrinsic with two arguments.
Call the LLVM intrinsic with the name intrin and return type type on arguments arg0 and arg1.
Parameters:
- intrin (StringLiteral): The name of the llvm intrinsic.
- type (AnyType): The return type of the intrinsic.
- T0 (AnyType): The type of the first argument to the intrinsic (arg0).
- T1 (AnyType): The type of the second argument to the intrinsic (arg1).
Args:
- arg0 (T0): The first argument to call the LLVM intrinsic with. The type of arg0 must be T0.
- arg1 (T1): The second argument to call the LLVM intrinsic with. The type of arg1 must be T1.
Returns:
The result of calling the llvm intrinsic with arg0 and arg1 as arguments.
llvm_intrinsic[intrin: StringLiteral, type: AnyType, T0: AnyType, T1: AnyType, T2: AnyType](arg0: T0, arg1: T1, arg2: T2) -> type
Call an LLVM intrinsic with three arguments.
Call the LLVM intrinsic with the name intrin and return type type on arguments arg0, arg1 and arg2.
Parameters:
- intrin (StringLiteral): The name of the llvm intrinsic.
- type (AnyType): The return type of the intrinsic.
- T0 (AnyType): The type of the first argument to the intrinsic (arg0).
- T1 (AnyType): The type of the second argument to the intrinsic (arg1).
- T2 (AnyType): The type of the third argument to the intrinsic (arg2).
Args:
- arg0 (T0): The first argument to call the LLVM intrinsic with. The type of arg0 must be T0.
- arg1 (T1): The second argument to call the LLVM intrinsic with. The type of arg1 must be T1.
- arg2 (T2): The third argument to call the LLVM intrinsic with. The type of arg2 must be T2.
Returns:
The result of calling the llvm intrinsic with arg0, arg1 and arg2 as arguments.
llvm_intrinsic[intrin: StringLiteral, type: AnyType, T0: AnyType, T1: AnyType, T2: AnyType, T3: AnyType](arg0: T0, arg1: T1, arg2: T2, arg3: T3) -> type
Call an LLVM intrinsic with four arguments.
Call the LLVM intrinsic with the name intrin and return type type on arguments arg0, arg1, arg2 and arg3.
Parameters:
- intrin (StringLiteral): The name of the llvm intrinsic.
- type (AnyType): The return type of the intrinsic.
- T0 (AnyType): The type of the first argument to the intrinsic (arg0).
- T1 (AnyType): The type of the second argument to the intrinsic (arg1).
- T2 (AnyType): The type of the third argument to the intrinsic (arg2).
- T3 (AnyType): The type of the fourth argument to the intrinsic (arg3).
Args:
- arg0 (T0): The first argument to call the LLVM intrinsic with. The type of arg0 must be T0.
- arg1 (T1): The second argument to call the LLVM intrinsic with. The type of arg1 must be T1.
- arg2 (T2): The third argument to call the LLVM intrinsic with. The type of arg2 must be T2.
- arg3 (T3): The fourth argument to call the LLVM intrinsic with. The type of arg3 must be T3.
Returns:
The result of calling the llvm intrinsic with arg0, arg1, arg2 and arg3 as arguments.
llvm_intrinsic[intrin: StringLiteral, type: AnyType, T0: AnyType, T1: AnyType, T2: AnyType, T3: AnyType, T4: AnyType](arg0: T0, arg1: T1, arg2: T2, arg3: T3, arg4: T4) -> type
Call an LLVM intrinsic with five arguments.
Call the LLVM intrinsic with the name intrin and return type type on arguments arg0, arg1, arg2, arg3 and arg4.
Parameters:
- intrin (StringLiteral): The name of the llvm intrinsic.
- type (AnyType): The return type of the intrinsic.
- T0 (AnyType): The type of the first argument to the intrinsic (arg0).
- T1 (AnyType): The type of the second argument to the intrinsic (arg1).
- T2 (AnyType): The type of the third argument to the intrinsic (arg2).
- T3 (AnyType): The type of the fourth argument to the intrinsic (arg3).
- T4 (AnyType): The type of the fifth argument to the intrinsic (arg4).
Args:
- arg0 (T0): The first argument to call the LLVM intrinsic with. The type of arg0 must be T0.
- arg1 (T1): The second argument to call the LLVM intrinsic with. The type of arg1 must be T1.
- arg2 (T2): The third argument to call the LLVM intrinsic with. The type of arg2 must be T2.
- arg3 (T3): The fourth argument to call the LLVM intrinsic with. The type of arg3 must be T3.
- arg4 (T4): The fifth argument to call the LLVM intrinsic with. The type of arg4 must be T4.
Returns:
The result of calling the llvm intrinsic with arg0, arg1, arg2, arg3 and arg4 as arguments.
masked_load
masked_load[size: Int, type: DType](addr: DTypePointer[type], mask: SIMD[bool, size], passthrough: SIMD[type, size], alignment: Int) -> SIMD[type, size]
Load data from memory and return it, replacing masked lanes with values from the passthrough vector.
Parameters:
- size (Int): Size of the return SIMD buffer.
- type (DType): DType of the return SIMD buffer.
Args:
- addr (DTypePointer[type]): The base pointer for the load.
- mask (SIMD[bool, size]): A binary vector which prevents memory access to certain lanes of the memory stored at addr.
- passthrough (SIMD[type, size]): In the result vector, the masked-off lanes are replaced with the passthrough vector.
- alignment (Int): The alignment of the source addresses. Must be 0 or a power of two constant integer value. Default is 1.
Returns:
The loaded memory stored in a vector of type SIMD[type, size].
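A masked load is commonly used to read the tail of a buffer without touching memory past its end. The scalar reference model below sketches that behavior in Python, not Mojo. Memory is a list, addr is an index into it, and `masked_load_model` is a hypothetical name.

```python
def masked_load_model(memory, addr, mask, passthrough):
    """Lane i reads memory[addr + i] when mask[i] is set,
    otherwise it takes passthrough[i] without touching memory."""
    return [
        memory[addr + i] if active else fallback
        for i, (active, fallback) in enumerate(zip(mask, passthrough))
    ]


ml_memory = [7, 8, 9]
# Only two elements remain after index 1; the last two lanes are masked off,
# so the out-of-range slots are never accessed.
ml_loaded = masked_load_model(ml_memory, 1, [True, True, False, False], [0, 0, 0, 0])
# ml_loaded is [8, 9, 0, 0]
```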
masked_store
masked_store[size: Int, type: DType](value: SIMD[type, size], addr: DTypePointer[type], mask: SIMD[bool, size], alignment: Int)
Store a value at a memory location, skipping masked lanes.
Parameters:
- size (Int): Size of value, the data to store.
- type (DType): DType of value, the data to store.
Args:
- value (SIMD[type, size]): The vector containing data to store.
- addr (DTypePointer[type]): The memory location to store data at.
- mask (SIMD[bool, size]): A binary vector which prevents memory access to certain lanes of value.
- alignment (Int): The alignment of the destination locations. Must be 0 or a power of two constant integer value.
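The store-side counterpart can be modeled the same way. This Python sketch is a scalar reference model, not the Mojo function. Memory is a list, addr is an index, and `masked_store_model` is a hypothetical name.

```python
def masked_store_model(value, memory, addr, mask):
    """Lane i writes value[i] to memory[addr + i] only when mask[i] is set;
    masked-off slots keep their previous contents."""
    for i, (lane, active) in enumerate(zip(value, mask)):
        if active:
            memory[addr + i] = lane


ms_memory = [0, 0, 0, 0, 0]
masked_store_model([1, 2, 3, 4], ms_memory, 1, [True, False, True, False])
# ms_memory is now [0, 1, 0, 3, 0]
```

Unlike compressed_store, each lane keeps its own slot, so skipped lanes leave gaps.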
prefetch
prefetch[type: DType, params: PrefetchOptions](addr: DTypePointer[type])
Prefetch an instruction or data into cache before it is used.
The prefetch function provides hints that let the target prefetch instructions or data into cache before they are used.
Parameters:
- type (DType): The DType of the value stored in addr.
- params (PrefetchOptions): Configuration options for the prefetch intrinsic.
Args:
- addr (DTypePointer[type]): The data pointer to prefetch.
scatter
scatter[size: Int, type: DType](value: SIMD[type, size], base: SIMD[address, size], mask: SIMD[bool, size], alignment: Int)
Scatter takes scalar values from a SIMD vector and scatters them into a vector of pointers.
The scatter operation takes scalar values from a SIMD vector and stores them to the memory locations given by a vector of pointers. The memory locations are provided in the vector of pointers base as addresses. The memory is stored according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.
The value operand is a vector value to be written to memory. The base operand is a vector of pointers, pointing to where the value elements should be stored. It has the same underlying type as the value operand. The mask operand is a vector of boolean values. The mask and the value operand must have the same number of vector elements.
The behavior of scatter is undefined if the op stores into the same memory location more than once.
In general, for some vector %value, vector of pointers %base, and mask %mask, instructions of the form:
    %0 = pop.simd.scatter %value, %base[%mask] : !pop.simd<N, type>
are equivalent to the following sequence of scalar stores in C++:
    for (int i = 0; i < N; i++)
      if (mask[i])
        *base[i] = value[i];
Parameters:
- size (Int): Size of value, the vector to store.
- type (DType): DType of value, the vector to store.
Args:
- value (SIMD[type, size]): The vector containing the values to scatter to memory.
- base (SIMD[address, size]): The vector containing memory addresses that scatter will access.
- mask (SIMD[bool, size]): A binary vector which prevents memory access to certain lanes of the base vector.
- alignment (Int): The alignment of the destination addresses. Must be 0 or a power of two constant integer value.
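The scalar store loop described for scatter can be exercised with a small reference model. This is a Python sketch, not Mojo. Memory is a list, the pointers in base are modeled as indices, and `scatter_model` is a hypothetical name. The example uses distinct addresses because, as noted above, storing to the same location twice is undefined.

```python
def scatter_model(value, memory, base, mask):
    """Lane i writes value[i] to memory[base[i]] when mask[i] is set."""
    for lane, addr, active in zip(value, base, mask):
        if active:
            memory[addr] = lane


sc_memory = [0] * 6
scatter_model([10, 20, 30, 40], sc_memory, [4, 0, 2, 5], [True, True, False, True])
# sc_memory is now [20, 0, 0, 0, 10, 40]; lane 2 was masked off
```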
strided_load
strided_load[size: Int, type: DType](addr: DTypePointer[type], stride: Int, mask: SIMD[bool, size]) -> SIMD[type, size]
Load values from addr according to a specific stride.
Parameters:
- size (Int): Size of the result vector.
- type (DType): DType of the result vector.
Args:
- addr (DTypePointer[type]): The memory location to load data from.
- stride (Int): How many lanes to skip before loading again.
- mask (SIMD[bool, size]): A binary vector which prevents memory access to certain lanes of the result vector.
Returns:
A vector containing the loaded data.
strided_load[size: Int, type: DType](addr: DTypePointer[type], stride: Int) -> SIMD[type, size]
Load values from addr according to a specific stride.
Parameters:
- size (Int): Size of the result vector.
- type (DType): DType of the result vector.
Args:
- addr (DTypePointer[type]): The memory location to load data from.
- stride (Int): How many lanes to skip before loading again.
Returns:
A vector containing the loaded data.
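As a rough picture of both strided_load overloads, here is a scalar reference model in Python, not Mojo. It rests on two assumptions that this reference does not confirm: lane i reads the element at addr + i * stride, and masked-off lanes come back as zero. Memory is a list, addr is an index, and `strided_load_model` is a hypothetical name.

```python
def strided_load_model(memory, addr, stride, size, mask=None):
    """Lane i reads memory[addr + i * stride] (assumed addressing).
    Masked-off lanes yield 0 (assumed fill value)."""
    if mask is None:
        mask = [True] * size  # the unmasked overload loads every lane
    return [memory[addr + i * stride] if mask[i] else 0 for i in range(size)]


sl_memory = list(range(10))
sl_all = strided_load_model(sl_memory, 1, 3, 3)
sl_masked = strided_load_model(sl_memory, 1, 3, 3, [True, False, True])
# sl_all is [1, 4, 7]; sl_masked is [1, 0, 7]
```

A typical use is reading one column of a row-major matrix, with stride set to the row length.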
strided_store
strided_store[size: Int, type: DType](value: SIMD[type, size], addr: DTypePointer[type], stride: Int, mask: SIMD[bool, size])
Store values to addr according to a specific stride.
Parameters:
- size (Int): Size of value, the value to store.
- type (DType): DType of value, the value to store.
Args:
- value (SIMD[type, size]): The values to store.
- addr (DTypePointer[type]): The location to store values at.
- stride (Int): How many lanes to skip before storing again.
- mask (SIMD[bool, size]): A binary vector which prevents memory access to certain lanes of value.
strided_store[size: Int, type: DType](value: SIMD[type, size], addr: DTypePointer[type], stride: Int)
Store values to addr according to a specific stride.
Parameters:
- size (Int): Size of value, the value to store.
- type (DType): DType of value, the value to store.
Args:
- value (SIMD[type, size]): The values to store.
- addr (DTypePointer[type]): The location to store values at.
- stride (Int): How many lanes to skip before storing again.
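Both strided_store overloads can be pictured with the scalar reference model below. It is a Python sketch, not Mojo, and it assumes that lane i is written to the element at addr + i * stride, which this reference does not spell out. Memory is a list, addr is an index, and `strided_store_model` is a hypothetical name.

```python
def strided_store_model(value, memory, addr, stride, mask=None):
    """Lane i writes value[i] to memory[addr + i * stride] (assumed addressing).
    Masked-off lanes leave memory untouched."""
    if mask is None:
        mask = [True] * len(value)  # the unmasked overload stores every lane
    for i, (lane, active) in enumerate(zip(value, mask)):
        if active:
            memory[addr + i * stride] = lane


ss_memory = [0] * 8
strided_store_model([1, 2, 3], ss_memory, 0, 3)
# ss_memory is now [1, 0, 0, 2, 0, 0, 3, 0]
```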