Mojo function
parallel_memcpy
parallel_memcpy[type: DType](dest: UnsafePointer[SIMD[type, 1], 0, 0, alignof[::AnyType,__mlir_type.!kgen.target]() if triple_is_nvidia_cuda() else 1], src: UnsafePointer[SIMD[type, 1], 0, 0, alignof[::AnyType,__mlir_type.!kgen.target]() if triple_is_nvidia_cuda() else 1], count: Int, count_per_task: Int, num_tasks: Int)
Copies count elements from a memory buffer src to dest in parallel by spawning num_tasks tasks, each copying count_per_task elements.
Parameters:
- type (DType): The element dtype.
Args:
- dest (UnsafePointer[SIMD[type, 1], 0, 0, alignof[::AnyType,__mlir_type.!kgen.target]() if triple_is_nvidia_cuda() else 1]): The destination buffer.
- src (UnsafePointer[SIMD[type, 1], 0, 0, alignof[::AnyType,__mlir_type.!kgen.target]() if triple_is_nvidia_cuda() else 1]): The source buffer.
- count (Int): The number of elements to copy.
- count_per_task (Int): The number of elements each task copies.
- num_tasks (Int): The number of tasks to run in parallel.
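The way count, count_per_task, and num_tasks relate can be sketched outside Mojo. The Python below is only a conceptual illustration, not the library's implementation: the name parallel_copy_sketch is hypothetical, a thread pool stands in for Mojo's task runtime, and clamping the last chunk to count is an assumption about how a final partial chunk would be handled.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_copy_sketch(dest, src, count, count_per_task, num_tasks):
    """Copy src[:count] into dest[:count] using num_tasks chunked tasks.

    Hypothetical helper for illustration; not part of any Mojo API.
    """

    def copy_chunk(task_id):
        start = task_id * count_per_task
        # Assumption: the final chunk is clamped so it never runs past count.
        end = min(start + count_per_task, count)
        if start < end:
            dest[start:end] = src[start:end]

    with ThreadPoolExecutor(max_workers=max(1, num_tasks)) as pool:
        # list() waits for every task and re-raises any error from a task.
        list(pool.map(copy_chunk, range(num_tasks)))


src = bytearray(range(10))
dest = bytearray(10)
# 3 tasks of up to 4 elements each cover all 10 elements (4 + 4 + 2).
parallel_copy_sketch(dest, src, count=10, count_per_task=4, num_tasks=3)
```

In this sketch the whole range is covered only when num_tasks * count_per_task is at least count, so a caller choosing the two values by hand would need to keep them consistent with count.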
parallel_memcpy[type: DType](dest: UnsafePointer[SIMD[type, 1], 0, 0, alignof[::AnyType,__mlir_type.!kgen.target]() if triple_is_nvidia_cuda() else 1], src: UnsafePointer[SIMD[type, 1], 0, 0, alignof[::AnyType,__mlir_type.!kgen.target]() if triple_is_nvidia_cuda() else 1], count: Int)
Copies count elements from a memory buffer src to dest in parallel.
Parameters:
- type (DType): The element type.
Args:
- dest (UnsafePointer[SIMD[type, 1], 0, 0, alignof[::AnyType,__mlir_type.!kgen.target]() if triple_is_nvidia_cuda() else 1]): The destination pointer.
- src (UnsafePointer[SIMD[type, 1], 0, 0, alignof[::AnyType,__mlir_type.!kgen.target]() if triple_is_nvidia_cuda() else 1]): The source pointer.
- count (Int): The number of elements to copy.
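This overload takes only count, so the task size and task count must be chosen internally. The page does not say how that choice is made. The Python sketch below shows one plausible heuristic purely as an illustration: plan_tasks, min_task_size, and max_tasks are hypothetical names, and the default of 4096 elements is an arbitrary assumption, not a value taken from the library.

```python
import os


def plan_tasks(count, min_task_size=4096, max_tasks=None):
    """Return (count_per_task, num_tasks) that together cover count elements.

    Hypothetical heuristic for illustration; not the Mojo implementation.
    """
    if max_tasks is None:
        max_tasks = os.cpu_count() or 1
    if count <= 0:
        return 0, 0
    # Split evenly across workers (ceiling division), but keep each task
    # at or above min_task_size so small copies do not spawn many tasks.
    count_per_task = max(min_task_size, -(-count // max_tasks))
    # Ceiling division again: enough tasks to cover every element.
    num_tasks = -(-count // count_per_task)
    return count_per_task, num_tasks


small = plan_tasks(100, min_task_size=4096, max_tasks=8)
large = plan_tasks(1_000_000, min_task_size=4096, max_tasks=8)
```

Under these assumptions a small copy collapses to a single task, while a large copy is spread across all available workers.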