Prefill
The first phase of inference, in which an AI model processes the full input and initializes a cache that accelerates subsequent predictions.
Different model architectures may have their own version of prefill, but the term is primarily associated with large language models (LLMs), where it is also called context encoding.
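As a rough illustration for the LLM case, the sketch below shows a toy single-head attention layer in plain Python. It assumes the cache is a key-value (KV) cache, which is the common arrangement in transformer LLMs. The names `project`, `attend`, `prefill`, and `decode_step` are hypothetical, and the fixed scaling in `project` only stands in for learned weight matrices. It is not any particular library's implementation.

```python
import math

def project(x, scale):
    # Hypothetical stand-in for a learned linear projection (assumption).
    return [scale * v for v in x]

def attend(query, keys, values):
    # Scaled dot-product attention of one query over all cached positions.
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [sum(w * v[i] for w, v in zip(weights, values)) / total for i in range(d)]

def prefill(prompt):
    # Prefill: compute keys and values for every prompt position up front
    # and store them in the cache, then produce the output for the last position.
    cache = {
        "keys": [project(x, 0.5) for x in prompt],
        "values": [project(x, 2.0) for x in prompt],
    }
    last_output = attend(prompt[-1], cache["keys"], cache["values"])
    return last_output, cache

def decode_step(x, cache):
    # Decode: only the new token is projected; earlier keys/values are reused.
    cache["keys"].append(project(x, 0.5))
    cache["values"].append(project(x, 2.0))
    return attend(x, cache["keys"], cache["values"])

prompt = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_token = [0.5, -0.5]

_, cache = prefill(prompt)
cached_output = decode_step(new_token, cache)

# Recomputing everything from scratch gives the same result as using the cache.
recomputed_output, _ = prefill(prompt + [new_token])
```

In this sketch the cached and recomputed outputs agree, but the cached path projects only one new token per step rather than reprocessing the whole input, which is the saving the cache is meant to provide.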