> For the complete documentation index, see [llms.txt](https://docs.modular.com/llms.txt).
> Markdown versions of all pages are available by appending .md to any URL (e.g. /max/get-started.md).

# Image generation

With MAX, you can run open-source image generation models locally and access them through a REST API. This page explains how to use the [`v1/responses`](https://docs.modular.com/max/rest-api.md#POST/v1/responses) endpoint to generate images from text prompts or transform existing images, with examples for each input type.

## Endpoint

The MAX [`v1/responses`](https://docs.modular.com/max/rest-api.md#POST/v1/responses) endpoint provides a unified interface for diverse AI tasks including image generation, with structured input and output handling. It's built on [Open Responses](https://huggingface.co/blog/open-responses), an open-source initiative to create a standardized, provider-agnostic API specification that works across different AI providers and model backends.

### Text input

For text-to-image generation, set `input` to a plain string describing the image you want. The model returns the generated image as base64-encoded data in `output[0].content[0].image_data`:

**Python:**

```python
response = client.responses.create(
    model="black-forest-labs/FLUX.2-dev",
    input="Your text prompt here",
    extra_body={
        "provider_options": {
            "image": {"height": 1024, "width": 1024, "steps": 28}
        }
    }
)

image_data = response.output[0].content[0].image_data
```

---

**curl:**

```bash
curl -X POST http://localhost:8000/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "black-forest-labs/FLUX.2-dev",
    "input": "Your text prompt here",
    "provider_options": {
      "image": {"height": 1024, "width": 1024, "steps": 28}
    }
  }'
```

### Image URL input

For image-to-image workflows, set `input` to a structured message array containing the source image URL and a text prompt describing the transformation.
The `type` field distinguishes image and text content within the same message:

**Python:**

```python
response = client.responses.create(
    model="black-forest-labs/FLUX.2-dev",
    input=[
        {
            "role": "user",
            "content": [
                {
                    "type": "input_image",
                    "image_url": "https://example.com/input.png"
                },
                {
                    "type": "input_text",
                    "text": "Your transformation prompt"
                }
            ]
        }
    ],
    extra_body={
        "provider_options": {
            "image": {"height": 1024, "width": 1024, "steps": 28}
        }
    }
)

image_data = response.output[0].content[0].image_data
```

---

**curl:**

```bash
curl -X POST http://localhost:8000/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "black-forest-labs/FLUX.2-dev",
    "input": [
      {
        "role": "user",
        "content": [
          {
            "type": "input_image",
            "image_url": "https://example.com/input.png"
          },
          {
            "type": "input_text",
            "text": "Your transformation prompt"
          }
        ]
      }
    ],
    "provider_options": {
      "image": {"height": 1024, "width": 1024, "steps": 28}
    }
  }'
```

### Local file input

Local files must be base64-encoded and passed as a data URI in the `image_url` field, using the format `data:<mime-type>;base64,<data>`.
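When input files are not always PNGs, the MIME type in the data URI can be derived from the file extension rather than hard-coded. The following is a minimal sketch using only the Python standard library; `file_to_data_uri` and its `default_mime` fallback are hypothetical names for illustration, not part of MAX or the OpenAI client.

```python
import base64
import mimetypes
from pathlib import Path


def file_to_data_uri(path, default_mime="application/octet-stream"):
    """Return a base64 data URI for the file at `path` (hypothetical helper)."""
    # Guess the MIME type from the file extension, e.g. ".png" -> "image/png".
    mime, _ = mimetypes.guess_type(str(path))
    # Read the raw bytes and base64-encode them as ASCII text.
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    # Fall back to a generic type when the extension is not recognized.
    return f"data:{mime or default_mime};base64,{encoded}"
```

The returned string can be used as the `image_url` value of an `input_image` content item.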
**Python:**

```python
import base64

# Read the local image and base64-encode it for the data URI.
with open("/path/to/image.png", "rb") as f:
    image_base64 = base64.b64encode(f.read()).decode("utf-8")

response = client.responses.create(
    model="black-forest-labs/FLUX.2-dev",
    input=[
        {
            "role": "user",
            "content": [
                {
                    "type": "input_image",
                    "image_url": f"data:image/png;base64,{image_base64}"
                },
                {
                    "type": "input_text",
                    "text": "Your transformation prompt"
                }
            ]
        }
    ],
    extra_body={
        "provider_options": {
            "image": {"height": 1024, "width": 1024, "steps": 28}
        }
    }
)

image_data = response.output[0].content[0].image_data
```

---

**curl:**

```bash
IMAGE_DATA=$(base64 -w 0 /path/to/image.png)
cat <
```

Then, create a client and make a request to the model:

```python title="generate-image.py"
import base64
from openai import OpenAI

# Point the OpenAI client at the local MAX server.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.responses.create(
    model="black-forest-labs/FLUX.2-dev",
    input="A serene mountain landscape at sunset",
    extra_body={
        "provider_options": {
            "image": {"height": 512, "width": 512, "steps": 28}
        }
    }
)

image_data = response.output[0].content[0].image_data

# Decode the base64 payload and write the image to disk.
with open("output-text-to-image.png", "wb") as f:
    f.write(base64.b64decode(image_data))
```

Run the script to generate the image:

```bash
python generate-image.py
```

The script saves the generated image to `output-text-to-image.png` in your current directory.

---

**curl:**

Send a request to the `v1/responses` endpoint and decode the base64-encoded image data from the response:

```bash
curl -X POST http://localhost:8000/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "black-forest-labs/FLUX.2-dev",
    "input": "A serene mountain landscape at sunset",
    "provider_options": {
      "image": {"height": 512, "width": 512, "steps": 28}
    }
  }' | jq -r '.output[0].content[0].image_data' | base64 -d > output-text-to-image.png
```

The command extracts the image data with `jq` and writes the decoded image to `output-text-to-image.png`.
Your output should look similar to the following:

**Figure 1.** Text-to-image output: a serene mountain landscape at sunset, generated by FLUX.2-dev.
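If the saved file does not open, it can help to check that the decoded bytes look like an image before writing them to disk. The sketch below assumes the server returns PNG data, which the `.png` filenames on this page suggest but the page does not state; `decode_image` is a hypothetical helper name.

```python
import base64

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_image(image_data):
    """Decode base64 image data and report whether it looks like a PNG.

    Hypothetical helper: returns (raw_bytes, is_png).
    """
    raw = base64.b64decode(image_data)
    return raw, raw.startswith(PNG_SIGNATURE)
```

For example, `raw, is_png = decode_image(response.output[0].content[0].image_data)` returns the bytes to write and a flag indicating whether the PNG assumption holds for your model.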

### Use your generated image as input

You can take the image generated in the previous step and customize it further with the image-to-image workflow by providing both an image and a text prompt:

**Python:**

Read and encode the output image from the previous step, then send it along with a text prompt to the model:

```python title="generate-image-to-image.py"
import base64
from openai import OpenAI

# Point the OpenAI client at the local MAX server.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Encode the image from the previous step for the data URI.
with open("output-text-to-image.png", "rb") as f:
    image_base64 = base64.b64encode(f.read()).decode("utf-8")

response = client.responses.create(
    model="black-forest-labs/FLUX.2-dev",
    input=[
        {
            "role": "user",
            "content": [
                {
                    "type": "input_image",
                    "image_url": f"data:image/png;base64,{image_base64}"
                },
                {
                    "type": "input_text",
                    "text": "Transform this into a watercolor painting"
                }
            ]
        }
    ],
    extra_body={
        "provider_options": {
            "image": {"height": 512, "width": 512, "steps": 28}
        }
    }
)

image_data = response.output[0].content[0].image_data

# Decode the base64 payload and write the transformed image to disk.
with open("output-image-to-image.png", "wb") as f:
    f.write(base64.b64decode(image_data))
```

Run the script to generate the image:

```bash
python generate-image-to-image.py
```

The script saves the transformed image to `output-image-to-image.png` in your current directory.

---

**curl:**

First, encode the output image to base64 format:

```bash
IMAGE_BASE64=$(base64 -w 0 /path/to/image-generation-quickstart/output-text-to-image.png)
```

The base64 string is very large. If you include it directly in the curl command, it can exceed the Linux argument size limit. Instead, store the request payload in a JSON file:

```bash
cat <
```

**Figure 2.** Image-to-image output: the mountain landscape transformed into a watercolor painting.
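The curl workflow above stores the request payload in a JSON file to avoid passing the large base64 string on the command line. As a rough Python sketch of that step, the following builds the same image-to-image request body shown earlier and writes it to disk. The function names and the `request.json` filename are hypothetical, and the data URI assumes a PNG input.

```python
import base64
import json
from pathlib import Path


def build_image_to_image_payload(image_path, prompt,
                                 model="black-forest-labs/FLUX.2-dev",
                                 height=512, width=512, steps=28):
    """Build the request body for an image-to-image call (hypothetical helper)."""
    # Base64-encode the source image; the data URI below assumes PNG input.
    encoded = base64.b64encode(Path(image_path).read_bytes()).decode("ascii")
    return {
        "model": model,
        "input": [
            {
                "role": "user",
                "content": [
                    {"type": "input_image",
                     "image_url": f"data:image/png;base64,{encoded}"},
                    {"type": "input_text", "text": prompt},
                ],
            }
        ],
        "provider_options": {
            "image": {"height": height, "width": width, "steps": steps}
        },
    }


def write_payload(payload, out_path="request.json"):
    """Serialize the payload as single-line JSON and return the file path."""
    Path(out_path).write_text(json.dumps(payload), encoding="utf-8")
    return out_path
```

The resulting file can then be sent with curl's `-d @request.json` form, which reads the request body from a file instead of the command line.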

## Next steps

Now that you can generate images, explore other inference capabilities and deployment options.