# Functional

Module

Implements higher-order functions.

**Aliases:**

`BinaryTile1DTileUnitFunc = fn[Int](Int, Int) capturing -> None`

Signature of a tiled function that performs some work with a dynamic tile size and a secondary static tile size.

`Dynamic1DTileUnitFunc = fn(Int, Int) capturing -> None`

Signature of a 1d tiled function that performs some work with a dynamic tile size and an offset. i.e. func(offset: Int, tile_size: Int)

`Dynamic1DTileUnswitchUnitFunc = fn[Bool](Int, Int, Int) capturing -> None`

`Static1DTileUnitFunc = fn[Int](Int) capturing -> None`

Signature of a 1d tiled function that performs some work with a static tile size and an offset. i.e. func<tile_size: Int> (offset: Int)

`Static1DTileUnswitchUnitFunc = fn[Int, Bool](Int, Int) capturing -> None`

Signature of a tiled function that performs some work with a static tile size and an offset. i.e. func<tile_size: Int> (offset: Int)

`Static2DTileUnitFunc = fn[Int, Int](Int, Int) capturing -> None`

Signature of a 2d tiled function that performs some work with a static tile size and an offset. i.e. func<tile_size_x: Int, tile_size_y: Int> (offset_x: Int, offset_y: Int)

`SwitchedFunction = fn[Bool]() capturing -> None`

`SwitchedFunction2 = fn[Bool, Bool]() capturing -> None`

`async_parallelize`

`async_parallelize[func: fn(Int) capturing -> None](out_chain: OutputChainPtr, num_work_items: Int)`

Execute func(0) â€¦ func(num_work_items-1) as sub-tasks in parallel and return imediatly.

Execute func(0) â€¦ func(num_work_items-1) as sub-tasks in parallel and mark out_chain as ready when all functions have returned. This function will return when the sub-tasks have been scheduled but not necessarily completed. The runtime may execute the sub-tasks in any order and with any degree of concurrency.

All free variables in func must be â€śasync safeâ€ť. Currently this means: - The variable must be bound by a by-val function argument (ie no &), or let binding. - The variableâ€™s type must be â€śasync safeâ€ť, ie is marked as @register_passable and any internal pointers are to memory with lifetime at least until out_chain is ready. In practice, this means only pointers to buffers held alive by the runtime.

If num_work_items is 0 then the out_chain is marked as ready before async_parallelize returns. If num_work_items is 1 then func(0) is still executed as a sub-task.

**Parameters:**

**func**(`fn(Int) capturing -> None`

): The function to invoke.

**Args:**

**out_chain**(`OutputChainPtr`

): Out chain to attach results too.**num_work_items**(`Int`

): Number of parallel tasks.

`elementwise`

`elementwise[rank: Int, simd_width: Int, unroll_factor: Int, func: fn[Int, Int](StaticIntTuple[*(0,1)]) capturing -> None](shape: StaticIntTuple[rank], out_chain: OutputChainPtr)`

Execute funcwidth, rank as sub-tasks for a suitable combination of width and indices so as to cover shape.

**Parameters:**

**rank**(`Int`

): The rank of the buffer.**simd_width**(`Int`

): The SIMD vector width to use.**unroll_factor**(`Int`

): The unroll factor to use.**func**(`fn[Int, Int](StaticIntTuple[*(0,1)]) capturing -> None`

): The body function.

**Args:**

**shape**(`StaticIntTuple[rank]`

): The shape of the buffer.**out_chain**(`OutputChainPtr`

): The our chain to attach results to.

`elementwise[rank: Int, simd_width: Int, unroll_factor: Int, func: fn[Int, Int](StaticIntTuple[*(0,1)]) capturing -> None](shape: StaticIntTuple[rank], out_chain: OutputChainPtr)`

Execute funcwidth, rank as sub-tasks for a suitable combination of width and indices so as to cover shape.

All free vars in func must be â€śasync safeâ€ť, see async_parallelize.

**Parameters:**

**rank**(`Int`

): The rank of the buffer.**simd_width**(`Int`

): The SIMD vector width to use.**unroll_factor**(`Int`

): The unroll factor to use.**func**(`fn[Int, Int](StaticIntTuple[*(0,1)]) capturing -> None`

): The body function.

**Args:**

**shape**(`StaticIntTuple[rank]`

): The shape of the buffer.**out_chain**(`OutputChainPtr`

): The our chain to attach results to.

`get_num_workers`

`get_num_workers(problem_size: Int, runtime: Runtime) -> Int`

Return a number of workers to run in parallel.

**Args:**

**problem_size**(`Int`

): The number of parallel tasks.**runtime**(`Runtime`

): Runtime object.

**Returns:**

The number of workers to run in parallel.

`map`

`map[func: fn(Int) capturing -> None](size: Int)`

Map a function over a range from 0 to size.

**Parameters:**

**func**(`fn(Int) capturing -> None`

): Function to map.

**Args:**

**size**(`Int`

): The number of elements.

`parallelize`

`parallelize[func: fn(Int) capturing -> None]()`

Execute func(0) â€¦ func(N-1) as sub-tasks in parallel and block until completion. N is chosen to be the number of physical processors on the system.

Execute func(0) â€¦ func(N-1) as sub-tasks in parallel. This function will return after all the sub-tasks have completely been executed (and unlike async_parallelize this is a blocking operation).

**Parameters:**

**func**(`fn(Int) capturing -> None`

): The function to invoke.

`parallelize[func: fn(Int) capturing -> None](num_work_items: Int)`

Execute func(0) â€¦ func(num_work_items-1) as sub-tasks in parallel and block until completion.

Execute func(0) â€¦ func(num_work_items-1) as sub-tasks in parallel. This function will return after all the sub-tasks have completely been executed (and unlike async_parallelize this is a blocking operation).

**Parameters:**

**func**(`fn(Int) capturing -> None`

): The function to invoke.

**Args:**

**num_work_items**(`Int`

): Number of parallel tasks.

`tile`

`tile[workgroup_function: fn[Int](Int) capturing -> None, tile_size_list: VariadicList[Int]](offset: Int, upperbound: Int)`

A generator that launches work groups in specified list of tile sizes.

A workgroup function is a function that can process a configurable consecutive â€śtileâ€ť of workload. E.g. work_on3 should launch computation on item 5,6,7, and should be semantically equivalent to work_on1, work_on1, work_on1.

This generator will try to proceed with the given list of tile sizes on the listed order. E.g. tile func, (3,2,1) will try to call func[3] starting from offset until remaining work is less than 3 from upperbound and then try func[2], and then func[1] etc.

**Parameters:**

**workgroup_function**(`fn[Int](Int) capturing -> None`

): Workgroup function that processes one tile of workload.**tile_size_list**(`VariadicList[Int]`

): List of tile sizes to launch work.

**Args:**

**offset**(`Int`

): The initial index to start the work from.**upperbound**(`Int`

): The runtime upperbound that the work function should not exceed.

`tile[workgroup_function: fn(Int, Int) capturing -> None](offset: Int, upperbound: Int, tile_size_list: VariadicList[Int])`

A generator that launches work groups in specified list of tile sizes.

This is the version of tile generator for the case where work_group function can take the tile size as a runtime value.

**Parameters:**

**workgroup_function**(`fn(Int, Int) capturing -> None`

): Workgroup function that processes one tile of workload.

**Args:**

**offset**(`Int`

): The initial index to start the work from.**upperbound**(`Int`

): The runtime upperbound that the work function should not exceed.**tile_size_list**(`VariadicList[Int]`

): List of tile sizes to launch work.

`tile[secondary_tile_size_list: VariadicList[Int], secondary_cleanup_tile: Int, workgroup_function: fn[Int](Int, Int) capturing -> None](offset: Int, upperbound: Int, primary_tile_size_list: VariadicList[Int], primary_cleanup_tile: Int)`

A generator that launches work groups in specified list of tile sizes until the sum of primary_tile_sizes has exceeded the upperbound.

**Parameters:**

**secondary_tile_size_list**(`VariadicList[Int]`

): List of static tile sizes to launch work.**secondary_cleanup_tile**(`Int`

): Last static tile to use when primary tile sizes donâ€™t fit exactly within the upperbound.**workgroup_function**(`fn[Int](Int, Int) capturing -> None`

): Workgroup function that processes one tile of workload.

**Args:**

**offset**(`Int`

): The initial index to start the work from.**upperbound**(`Int`

): The runtime upperbound that the work function should not exceed.**primary_tile_size_list**(`VariadicList[Int]`

): List of dynamic tile sizes to launch work.**primary_cleanup_tile**(`Int`

): Last dynamic tile to use when primary tile sizes donâ€™t fit exactly within the upperbound.

`tile[workgroup_function: fn[Int, Int](Int, Int) capturing -> None, tile_sizes_x: VariadicList[Int], tile_sizes_y: VariadicList[Int]](offset_x: Int, offset_y: Int, upperbound_x: Int, upperbound_y: Int)`

Launches workgroup_function using the largest tile sizes possible in each dimension, starting from the x and y offset, until the x and y upperbounds are reached.

**Parameters:**

**workgroup_function**(`fn[Int, Int](Int, Int) capturing -> None`

): Funtion that is invoked for each tile and offset.**tile_sizes_x**(`VariadicList[Int]`

): List of tile sizes to use for the first parameter of workgroup_function.**tile_sizes_y**(`VariadicList[Int]`

): List of tile sizes to use for the second parameter of workgroup_function.

**Args:**

**offset_x**(`Int`

): Initial x offset passed to workgroup_function.**offset_y**(`Int`

): Initial y offset passed to workgroup_function.**upperbound_x**(`Int`

): Max offset in x dimension passed to workgroup function.**upperbound_y**(`Int`

): Max offset in y dimension passed to workgroup function.

`tile_and_unswitch`

`tile_and_unswitch[workgroup_function: fn[Int, Bool](Int, Int) capturing -> None, tile_size_list: VariadicList[Int]](offset: Int, upperbound: Int)`

Perform time and unswitch functional transformation.

A variant of static tile given a workgroup function that can be unswitched. This generator is a fused version of tile and unswitch, where the static unswitch is true throughout the â€śinnerâ€ť portion of the workload and is false only on the residue tile.

**Parameters:**

**workgroup_function**(`fn[Int, Bool](Int, Int) capturing -> None`

): Workgroup function that processes one tile of workload.**tile_size_list**(`VariadicList[Int]`

): List of tile sizes to launch work.

**Args:**

**offset**(`Int`

): The initial index to start the work from.**upperbound**(`Int`

): The runtime upperbound that the work function should not exceed.

`tile_and_unswitch[workgroup_function: fn[Bool](Int, Int, Int) capturing -> None](offset: Int, upperbound: Int, tile_size_list: VariadicList[Int])`

Perform time and unswitch functional transformation.

A variant of dynamic tile given a workgroup function that can be unswitched. This generator is a fused version of tile and unswitch, where the static unswitch is true throughout the â€śinnerâ€ť portion of the workload and is false only on the residue tile.

**Parameters:**

**workgroup_function**(`fn[Bool](Int, Int, Int) capturing -> None`

): Workgroup function that processes one tile of workload.

**Args:**

**offset**(`Int`

): The initial index to start the work from.**upperbound**(`Int`

): The runtime upperbound that the work function should not exceed.**tile_size_list**(`VariadicList[Int]`

): List of tile sizes to launch work.

`unroll`

`unroll[count: Int, func: fn[Int]() capturing -> None]()`

Repeatedly evaluate a function `count`

times.

**Parameters:**

**count**(`Int`

): A number of repetitions.**func**(`fn[Int]() capturing -> None`

): The function to evaluate. The function should take a single Int argument.

`unroll2`

`unroll2[dim0: Int, dim1: Int, func: fn[Int, Int]() capturing -> None]()`

Repeateadly evaluate a 2D nested loop.

**Parameters:**

**dim0**(`Int`

): The first dimension size.**dim1**(`Int`

): The second dimension size.**func**(`fn[Int, Int]() capturing -> None`

): The function to evaluate. The function should take two Int arguments.

`unroll3`

`unroll3[dim0: Int, dim1: Int, dim2: Int, func: fn[Int, Int, Int]() capturing -> None]()`

Repeateadly evaluate a 3D nested loop.

**Parameters:**

**dim0**(`Int`

): The first dimension size.**dim1**(`Int`

): The second dimension size.**dim2**(`Int`

): The second dimension size.**func**(`fn[Int, Int, Int]() capturing -> None`

): The function to evaluate. The function should take three Int arguments.

`unswitch`

`unswitch[switched_func: fn[Bool]() capturing -> None](dynamic_switch: Bool)`

Perform a functional unswitch transformation.

Unswitch is a simple pattern that is similar idea to loop unswitching pass but extended to functional patterns. The pattern facilitates the following code transformation that reduces the number of branches in the generated code

```
Before:
```
for i in range(...)
if i < xxx:
...
```
After:
```
if i < ...
for i in range(...)
...
else
for i in range(...)
if i < xxx:
...
```
```

This unswitch function genralizes that pattern with the help of meta parame- ters and can be used to perform both loop unswitching and other tile predic- ate lifting like in simd and amx.

TODO: Generalize to support multiple predicates. TODO: Once nested lambdas compose well should make unswitch compose with tile in an easy way.

**Parameters:**

**switched_func**(`fn[Bool]() capturing -> None`

): The function containing the inner loop logic that can be unswitched.

**Args:**

**dynamic_switch**(`Bool`

): The dynamic condition that enables the unswitched code path.

`unswitch[switched_func: fn[Bool, Bool]() capturing -> None](dynamic_switch_a: Bool, dynamic_switch_b: Bool)`

Perform a functional 2-predicates unswitch transformation.

**Parameters:**

**switched_func**(`fn[Bool, Bool]() capturing -> None`

): The function containing the inner loop logic that has 2 predicates which can be unswitched.

**Args:**

**dynamic_switch_a**(`Bool`

): The first dynamic condition that enables the outer unswitched code path.**dynamic_switch_b**(`Bool`

): The second dynamic condition that enables the inner unswitched code path.

`vectorize`

`vectorize[simd_width: Int, func: fn[Int](Int) capturing -> None](size: Int)`

Map a function which is parametrized over a simd_width over a range from 0 to size in simd fashion.

**Parameters:**

**simd_width**(`Int`

): The SIMD vector width.**func**(`fn[Int](Int) capturing -> None`

): The function for the loop body.

**Args:**

**size**(`Int`

): The total loop count.

`vectorize_unroll`

`vectorize_unroll[simd_width: Int, unroll_factor: Int, func: fn[Int](Int) capturing -> None](size: Int)`

Map a function which is parametrized over a simd_width over a range from 0 to size in simd fashion and unroll the loop by unroll_factor.

**Parameters:**

**simd_width**(`Int`

): The SIMD vector width.**unroll_factor**(`Int`

): The unroll factor for the main loop.**func**(`fn[Int](Int) capturing -> None`

): The function for the loop body.

**Args:**

**size**(`Int`

): The total loop count.