
Autoregression

Autoregression is a process in which an AI model iteratively predicts the next value in a sequence from the previous values, feeding each output back in as input for the next prediction. Because each prediction depends on the prior context, generation is sequential, which limits parallelization across output steps.
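
The feedback loop described above can be sketched in a few lines of Python. This is a minimal illustration, not any particular library's API: `generate` and `toy_model` are hypothetical names, and the "model" is a stand-in that simply adds the last two values.

```python
from typing import Callable, List


def generate(
    predict_next: Callable[[List[int]], int],
    prompt: List[int],
    steps: int,
) -> List[int]:
    """Run an autoregressive loop for a fixed number of steps."""
    sequence = list(prompt)  # copy so the caller's prompt is not mutated
    for _ in range(steps):
        # Each prediction sees everything generated so far.
        next_value = predict_next(sequence)
        # The output is appended and becomes input for the next step.
        sequence.append(next_value)
    return sequence


def toy_model(sequence: List[int]) -> int:
    # Hypothetical stand-in for a real model: sum of the last two values.
    return sequence[-1] + sequence[-2]


print(generate(toy_model, [1, 1], 5))  # [1, 1, 2, 3, 5, 8, 13]
```

Step `n` cannot start until step `n - 1` has produced its value, which is why the loop cannot simply be spread across parallel workers.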

Autoregression is standard in transformer-based large language models (LLMs) and in other models that perform time-series forecasting. It explains why AI chatbots such as ChatGPT and Gemini stream their output one token (roughly a word or word fragment) at a time, although they sometimes run fast enough that several words appear to arrive at once.
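
Streaming follows naturally from this loop: each piece of output can be shown as soon as it is produced. The sketch below uses a Python generator to illustrate the idea. The lookup table `NEXT_TOKEN` and the function `stream_tokens` are hypothetical, and for brevity the toy "model" looks only at the most recent token, whereas a real LLM conditions on the whole preceding sequence.

```python
from typing import Dict, Iterator

# Hypothetical toy "model": maps the most recent token to the next one.
NEXT_TOKEN: Dict[str, str] = {
    "<start>": "Hello",
    "Hello": ",",
    ",": " world",
    " world": "<end>",
}


def stream_tokens(max_tokens: int = 10) -> Iterator[str]:
    """Yield tokens one at a time, as soon as each is predicted."""
    token = "<start>"
    for _ in range(max_tokens):
        token = NEXT_TOKEN.get(token, "<end>")
        if token == "<end>":
            return  # the model signaled that the output is complete
        yield token


# The caller can display each piece immediately instead of waiting
# for the full response.
for piece in stream_tokens():
    print(piece, end="", flush=True)
print()
```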
