Python class

FirstBlockCache

class max.pipelines.modeling.base.FirstBlockCache(dtype, device)

Bases: object

Standalone FirstBlockCache (FBCache) module.

Provides state allocation for FBCache. The conditional execution helpers (can_use_fbcache, fbcache_conditional_execution) remain in cache_mixin.py because they are called directly inside the transformers' _forward_fbcache methods.
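For orientation, the reuse decision behind first-block caching can be sketched in plain Python. This is only an illustration of the general idea (compare the first block's residual with the one cached from the previous step and reuse the cached output when the change is small). The function names, the relative L1 metric, and the threshold value are assumptions made for this sketch; they are not the signatures or behavior of can_use_fbcache or fbcache_conditional_execution.

```python
from typing import Optional, Sequence


def relative_change(current: Sequence[float], previous: Sequence[float]) -> float:
    """Sum of absolute differences, normalised by the magnitude of `previous`."""
    diff = sum(abs(c - p) for c, p in zip(current, previous))
    scale = sum(abs(p) for p in previous)
    # With no magnitude to normalise by, report an infinite change (never reuse).
    return diff / scale if scale > 0 else float("inf")


def should_reuse_cached_output(
    current_residual: Sequence[float],
    previous_residual: Optional[Sequence[float]],
    threshold: float,
) -> bool:
    """Hypothetical stand-in for the cache check; not the library helper."""
    if previous_residual is None:
        # First step of a request: nothing has been cached yet.
        return False
    return relative_change(current_residual, previous_residual) < threshold


# Usage: a small change in the first-block residual allows reuse.
previous = [1.0, 2.0]
print(should_reuse_cached_output([1.1, 2.0], previous, threshold=0.1))
print(should_reuse_cached_output([2.0, 4.0], previous, threshold=0.1))
```

When the check passes, the remaining transformer blocks would be skipped and the cached output applied instead; when it fails, the full forward pass runs and the cache is refreshed.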

Parameters:

  • dtype
  • device

create_state()

create_state(batch_size, seq_len, residual_dim, output_dim)

Allocate fresh per-request FirstBlockCache state tensors.

Parameters:

  • batch_size (int)
  • seq_len (int)
  • residual_dim (int)
  • output_dim (int)

Return type:

FirstBlockCacheState
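To make the role of the four size arguments concrete, here is a minimal, library-free sketch of what per-request state allocation could look like. Everything in it is an assumption for illustration: the SketchState class, its field names (prev_residual, prev_output), the zero initialisation, and the (batch_size, seq_len, dim) shapes are hypothetical and are not taken from FirstBlockCacheState. The real method also uses the dtype and device given to the constructor, which this sketch ignores.

```python
from dataclasses import dataclass
from typing import List, Tuple

Tensor3D = List[List[List[float]]]


def zeros(batch_size: int, seq_len: int, dim: int) -> Tensor3D:
    """Build a nested-list stand-in for a zero tensor; every row is a fresh list."""
    return [[[0.0] * dim for _ in range(seq_len)] for _ in range(batch_size)]


def shape(tensor: Tensor3D) -> Tuple[int, int, int]:
    """Return (batch_size, seq_len, dim) of a nested-list tensor."""
    return (len(tensor), len(tensor[0]), len(tensor[0][0]))


@dataclass
class SketchState:
    """Hypothetical stand-in for FirstBlockCacheState (field names are assumed)."""

    prev_residual: Tensor3D  # assumed shape: (batch_size, seq_len, residual_dim)
    prev_output: Tensor3D    # assumed shape: (batch_size, seq_len, output_dim)


def create_state_sketch(
    batch_size: int, seq_len: int, residual_dim: int, output_dim: int
) -> SketchState:
    """Allocate fresh buffers so that no state is shared between requests."""
    return SketchState(
        prev_residual=zeros(batch_size, seq_len, residual_dim),
        prev_output=zeros(batch_size, seq_len, output_dim),
    )


# Usage: one state object per request.
state = create_state_sketch(batch_size=2, seq_len=4, residual_dim=8, output_dim=16)
print(shape(state.prev_residual), shape(state.prev_output))
```

Allocating a new state for each request, rather than sharing one across requests, keeps one request's cached values from leaking into another's.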