Bring up a model with AI agent skills
You can accelerate bringing a new large language model architecture to MAX using AI coding agents equipped with Modular's official agent skills. These skills define automated, step-by-step workflows that let agents inspect Hugging Face checkpoints, scaffold custom architectures from similar models, implement layer-level differences, and run verification loops.
By delegating the mechanical tasks of mapping configurations and remapping weight keys to an agent, you can focus on directing high-level architecture decisions and verifying the final inference results.
Install the MAX skills
To equip your AI coding agent with the model bring-up workflow, you must install the MAX skills.
Install via npx
If you have Node.js installed, you can add all Modular agent skills to your assistant with a single command:
    npx skills add modular/skills
If you only want to install the model bring-up skill in isolation, specify the --skill flag:
    npx skills add modular/skills --skill import-model
Keep your skills up to date with the latest best practices by running:
    npx skills update
Manual installation
If you prefer to install the skills manually, clone the official repository:
    git clone https://github.com/modular/skills.git
After cloning, copy or symlink the individual skills into your AI agent's configuration directory. For Claude Code, copy the directories into ~/.claude/skills/:
    cp -r skills/import-model ~/.claude/skills/
Consult your specific agent's documentation to find its configuration and skills directory.
Start the model bring-up
To begin, open your AI coding agent in your project workspace and instruct it to import the model using its Hugging Face model ID.
Here are a few example prompts you can use to start the workflow:
    Import the Hugging Face model "Qwen/Qwen2.5-7B-Instruct" into MAX.
    Please bring up the Hugging Face model "microsoft/Phi-3-mini-4k-instruct" in MAX. Start from the llama3 architecture as the donor.
    I want to add a new causal language model architecture to MAX. The Hugging Face model ID is "allenai/OLMo-2-1124-7B".
After receiving the prompt, the agent initializes the decide and plan phase and presents the bring-up plan for your review.
How agent-driven model bring-up works
The import-model skill drives a three-phase workflow—decide and plan,
implement, then verify—while you remain the coordinator and validator at each
checkpoint. The sections below describe what the agent delivers in each phase
and how you steer it. For the full procedure, see the skill's
README.md.
Decide and plan
The agent inspects the target model's configuration, selects the closest
existing MAX architecture as a donor template (such as llama3 or qwen3), and
analyzes the structural differences between the two. It then presents a written
plan listing the chosen donor and the catalog of deltas.
Review the plan before authorizing the agent to write code. Confirm that it chose the correct donor and identified every unique layer property described in the model's paper or Hugging Face model card.
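When you review the catalog of deltas, it can help to diff the two configurations yourself. The following is a minimal sketch, not part of the skill: it assumes both configurations are already loaded as plain dictionaries, the helper name config_deltas is hypothetical, and the values (including the use_qk_norm key) are illustrative rather than copied from a real checkpoint.

```python
from typing import Any


def config_deltas(
    donor: dict[str, Any], target: dict[str, Any]
) -> dict[str, tuple[Any, Any]]:
    """Return {key: (donor_value, target_value)} for every key that differs.

    A key present on only one side is reported with None for the other side.
    """
    deltas = {}
    for key in sorted(donor.keys() | target.keys()):
        donor_value = donor.get(key)
        target_value = target.get(key)
        if donor_value != target_value:
            deltas[key] = (donor_value, target_value)
    return deltas


# Illustrative values only; not taken from any real model.
donor_config = {
    "hidden_size": 4096,
    "num_attention_heads": 32,
    "num_key_value_heads": 8,
    "rms_norm_eps": 1e-5,
}
target_config = {
    "hidden_size": 4096,
    "num_attention_heads": 32,
    "num_key_value_heads": 32,
    "rms_norm_eps": 1e-6,
    "use_qk_norm": True,  # hypothetical key standing in for a new layer property
}

deltas = config_deltas(donor_config, target_config)
for key, (donor_value, target_value) in deltas.items():
    print(f"{key}: donor={donor_value!r} target={target_value!r}")
```

Any key that appears in this diff but not in the agent's plan is worth raising before you approve the plan.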
Implement
After you approve the plan, the agent scaffolds the architecture package from the donor, maps Hugging Face config keys to the MAX configuration classes, edits the graph to implement each delta, and writes weight adapters that translate checkpoint names to the slots the MAX graph expects.
Make sure the agent updates the copied docstrings and comments so they describe your model rather than retaining stale references to the donor.
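The weight adapters described above amount to a renaming step from checkpoint keys to graph slots. The sketch below illustrates the idea with regular-expression rules. It is an assumption-laden example, not MAX's adapter API: RENAME_RULES, remap_key, remap_checkpoint, and the slot names are all hypothetical.

```python
import re

# Hypothetical rename rules: (regex on the checkpoint name, replacement).
# The target names are invented for illustration, not real MAX slot names.
RENAME_RULES = [
    (r"^model\.", ""),
    (r"\.self_attn\.", ".attention."),
    (r"\.mlp\.", ".feed_forward."),
]


def remap_key(name: str) -> str:
    """Apply every rename rule, in order, to one checkpoint key."""
    for pattern, replacement in RENAME_RULES:
        name = re.sub(pattern, replacement, name)
    return name


def remap_checkpoint(checkpoint_keys, expected_slots):
    """Map checkpoint keys to graph slots and report what fails to line up.

    Returns (mapping, orphans, missing):
      mapping: checkpoint key -> slot name, for keys that land on a known slot
      orphans: checkpoint keys whose remapped name matches no slot
      missing: slots that no checkpoint key filled
    """
    mapping, orphans = {}, []
    for key in checkpoint_keys:
        slot = remap_key(key)
        if slot in expected_slots:
            mapping[key] = slot
        else:
            orphans.append(key)
    missing = sorted(set(expected_slots) - set(mapping.values()))
    return mapping, orphans, missing


checkpoint_keys = [
    "model.layers.0.self_attn.q_proj.weight",
    "model.layers.0.mlp.gate_proj.weight",
    "model.layers.0.self_attn.rotary_emb.inv_freq",
]
expected_slots = {
    "layers.0.attention.q_proj.weight",
    "layers.0.feed_forward.gate_proj.weight",
    "layers.0.attention.o_proj.weight",
}

mapping, orphans, missing = remap_checkpoint(checkpoint_keys, expected_slots)
print("orphans:", orphans)
print("missing:", missing)
```

Reporting orphans and missing slots separately makes it easier to tell a wrong rename rule from a weight the graph never expected.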
Verify and validate
The agent runs linters and type checkers, serves the model locally to confirm the graph compiles and loads weights without orphan keys, then compares greedy token generation against the reference Hugging Face model.
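The greedy-generation comparison boils down to finding the first position where two token ID sequences disagree. Here is a minimal sketch, assuming you have already collected both sequences as lists of integers. The function name first_token_mismatch and the token IDs are hypothetical, and this is not the skill's own verification code.

```python
from itertools import zip_longest


def first_token_mismatch(reference_ids, candidate_ids):
    """Return the index of the first differing token, or None if identical.

    A length difference counts as a mismatch at the first missing position,
    because zip_longest pads the shorter sequence with None.
    """
    pairs = zip_longest(reference_ids, candidate_ids)
    for index, (ref, cand) in enumerate(pairs):
        if ref != cand:
            return index
    return None


# Made-up token IDs for illustration.
reference_ids = [101, 7, 42, 9]
candidate_ids = [101, 7, 43, 9]

mismatch = first_token_mismatch(reference_ids, candidate_ids)
if mismatch is None:
    print("token sequences match")
else:
    print(f"first mismatch at position {mismatch}")
```

A mismatch at position 0 usually points at a different class of problem than one that appears after many matching tokens, so the index itself is a useful signal to pass back to the agent.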
Review the generated output and verification reports. The skill is still being improved, so it doesn't guarantee a correct result out of the box. If the output is gibberish or incoherent, direct the agent to run a layer-by-layer divergence hunt: compare intermediate outputs and weights against the Hugging Face reference until it isolates and fixes the first point of divergence.
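A layer-by-layer divergence hunt can be pictured as walking both models' intermediate outputs in execution order and stopping at the first layer that exceeds a tolerance. The sketch below assumes those outputs have already been captured as flat lists of floats. The helper names, layer names, values, and the atol default are hypothetical choices for illustration, not the skill's procedure.

```python
import math


def max_abs_diff(a, b):
    """Largest element-wise absolute difference; NaN if any difference is NaN."""
    if len(a) != len(b):
        raise ValueError(f"size mismatch: {len(a)} vs {len(b)}")
    diffs = [abs(x - y) for x, y in zip(a, b)]
    if any(math.isnan(d) for d in diffs):
        return math.nan
    return max(diffs, default=0.0)


def first_divergent_layer(reference, candidate, atol=1e-3):
    """Return (layer_name, diff) for the first layer that diverges, else None.

    reference and candidate are lists of (layer_name, flat_values) pairs in
    execution order. A NaN difference is treated as a divergence.
    """
    if len(reference) != len(candidate):
        raise ValueError("both traces must contain the same number of layers")
    for (ref_name, ref_values), (cand_name, cand_values) in zip(reference, candidate):
        if ref_name != cand_name:
            raise ValueError(f"layer order differs: {ref_name} vs {cand_name}")
        diff = max_abs_diff(ref_values, cand_values)
        if math.isnan(diff) or diff > atol:
            return ref_name, diff
    return None


# Made-up activations for illustration.
reference = [
    ("embed", [0.10, -0.20]),
    ("layers.0.attention", [0.50, 0.25]),
    ("layers.0.mlp", [1.00, -1.00]),
]
candidate = [
    ("embed", [0.10, -0.20]),
    ("layers.0.attention", [0.50, 0.75]),
    ("layers.0.mlp", [3.00, 2.00]),
]

result = first_divergent_layer(reference, candidate)
print(result)
```

Stopping at the first divergent layer matters because every later layer inherits the error, so differences further down the trace say little about where the bug is.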
Next steps
Once the agent has created and verified your new model architecture, you can serve and deploy it:
- Serve custom model architectures: Learn how to package and run your new custom model architecture using max serve.
- Model bring-up workflow: Read the detailed manual bring-up steps to better understand graph compilation, memory sizing, and weight remapping.
- Using AI coding assistants: Configure your development environment with rules and context files for general AI-assisted development.