Mojo function
enqueue_function_from_cubin
enqueue_function_from_cubin[*Ts: DevicePassable](ctx: DeviceContext, data: List[Byte], module_name: String, function_name: String, *args: *Ts, *, grid_dim: Dim, block_dim: Dim, shared_mem_bytes: OptionalReg[Int] = None, var attributes: List[LaunchAttribute] = List[LaunchAttribute](, Tuple[]()))
Loads and enqueues an NVIDIA GPU kernel from a cubin.
This function provides a simple way to load pre-compiled GPU code and launch it in a single call. It handles loading the function, enqueueing it, and releasing the function handle.
Example:
from gpu.host import DeviceContext
from gpu.host.device_context import enqueue_function_from_cubin
from pathlib import Path

# arg1, arg2, and num_blocks are placeholders for the kernel's
# arguments and the number of blocks to launch.
var cubin_data = Path("kernel.cubin").read_bytes()
with DeviceContext() as ctx:
    enqueue_function_from_cubin(
        ctx,
        cubin_data,
        "my_module",
        "my_kernel",
        arg1,
        arg2,
        grid_dim=(num_blocks, 1, 1),
        block_dim=(256, 1, 1),
    )
    ctx.synchronize()
Parameters:
- *Ts (DevicePassable): The types of the arguments; each must implement DevicePassable.
Args:
- ctx (DeviceContext): The device context to use.
- data (List[Byte]): The binary data (e.g., cubin file contents) containing the compiled kernel.
- module_name (String): The name of the module.
- function_name (String): The name of the kernel function entry point.
- *args (*Ts): The arguments to pass to the kernel.
- grid_dim (Dim): Grid dimensions for the kernel launch.
- block_dim (Dim): Block dimensions for the kernel launch.
- shared_mem_bytes (OptionalReg[Int]): Amount of dynamic shared memory in bytes.
- attributes (List[LaunchAttribute]): Optional list of launch attributes.
Raises:
If loading or launching the function fails.
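Because the call can raise, a caller may want to wrap it in a `try` block. The following is a minimal sketch rather than a definitive pattern: it assumes a hypothetical `kernel.cubin` file containing a kernel named `my_kernel` that takes no arguments, and it reuses only the calls shown in the example above. The file name, module name, kernel name, and launch dimensions are all illustrative.

```mojo
from gpu.host import DeviceContext
from gpu.host.device_context import enqueue_function_from_cubin
from pathlib import Path


def main():
    try:
        # Hypothetical cubin file; reading it can also raise.
        var cubin_data = Path("kernel.cubin").read_bytes()
        with DeviceContext() as ctx:
            # "my_module" and "my_kernel" are assumed names; this
            # sketch assumes the kernel takes no arguments.
            enqueue_function_from_cubin(
                ctx,
                cubin_data,
                "my_module",
                "my_kernel",
                grid_dim=(1, 1, 1),
                block_dim=(256, 1, 1),
            )
            # Wait for the enqueued kernel to finish.
            ctx.synchronize()
    except e:
        # Reached if reading the file, loading, or launching fails.
        print("kernel load or launch failed:", e)
```

Keeping `ctx.synchronize()` inside the same `try` block means a failure raised while waiting for the kernel is handled in the same place as a load or launch failure.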