Mojo function
tcgen05_cp
tcgen05_cp[*, cta_group: SIMD[int32, 1], datapaths: Int, bits: Int, src_fmt: String = "", dst_fmt: String = "", multicast: String = ""](tmem_addr: SIMD[uint32, 1], s_desc: MMASmemDescriptor)
Copies data from the shared memory described by the matrix descriptor s_desc to the tensor memory at tmem_addr.
Note: This function is only available on NVIDIA Blackwell GPUs (SM 100+).
Parameters:
- cta_group (SIMD[int32, 1]): The cooperative thread array (CTA) group ID.
- datapaths (Int): The first dimension of the shape.
- bits (Int): The second dimension of the shape.
- src_fmt (String): Source format string.
- dst_fmt (String): Destination format string.
- multicast (String): Multicast string.
Args:
- tmem_addr (SIMD[uint32, 1]): Address of the tensor memory.
- s_desc (MMASmemDescriptor): Matrix descriptor for the copy operation.