# Image generation

With MAX, you can run open-source image generation models locally and access
them through a REST API. This page explains how to use the
[`v1/responses`](https://docs.modular.com/max/rest-api.md#POST/v1/responses) endpoint to generate
images from text prompts or transform existing images, with examples for each
input type.

## Endpoint

The MAX [`v1/responses`](https://docs.modular.com/max/rest-api.md#POST/v1/responses) endpoint
provides a unified interface for diverse AI tasks including image generation,
with structured input and output handling. It's built on [Open
Responses](https://huggingface.co/blog/open-responses), an open-source
initiative to create a standardized, provider-agnostic API specification that
works across different AI providers and model backends.

### Text input

For text-to-image generation, set `input` to a plain string describing
the image you want. The model returns the generated image as base64-encoded
data in `output[0].content[0].image_data`:

**Python:**

```python
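from openai import OpenAI

# Connect the OpenAI client to the local MAX server.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.responses.create(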
    model="black-forest-labs/FLUX.2-dev",
    input="Your text prompt here",
    extra_body={
        "provider_options": {
            "image": {"height": 1024, "width": 1024, "steps": 28}
        }
    }
)

image_data = response.output[0].content[0].image_data
```

---

**curl:**

```bash
curl -X POST http://localhost:8000/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "black-forest-labs/FLUX.2-dev",
    "input": "Your text prompt here",
    "provider_options": {
      "image": {"height": 1024, "width": 1024, "steps": 28}
    }
  }'
```
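
The curl request returns raw JSON, so the image has to be pulled out of the nested structure by hand. The following is a minimal sketch, assuming the response body follows the `output[0].content[0].image_data` path described above. The helper name `extract_image_bytes` is hypothetical and is not part of MAX or the OpenAI client:

```python
import base64
import json


def extract_image_bytes(response_json: dict) -> bytes:
    """Return the decoded image bytes from a v1/responses JSON body.

    Assumes the image sits at output[0].content[0].image_data, as this
    page describes; raises ValueError if that path is missing.
    """
    try:
        encoded = response_json["output"][0]["content"][0]["image_data"]
    except (KeyError, IndexError, TypeError) as exc:
        raise ValueError("response has no image_data at the expected path") from exc
    # validate=True rejects characters outside the base64 alphabet.
    return base64.b64decode(encoded, validate=True)


# Stand-in for a real response body, used only to exercise the helper.
fake_body = json.dumps(
    {"output": [{"content": [{"image_data": base64.b64encode(b"demo").decode()}]}]}
)
print(extract_image_bytes(json.loads(fake_body)))  # b'demo'
```

Raising a single `ValueError` for every malformed shape keeps the calling code simple: it either gets image bytes or one well-defined failure.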

### Image URL input

For image-to-image workflows, set `input` to a structured message array
containing the source image URL and a text prompt describing the
transformation. The `type` field distinguishes image and text content
within the same message:

**Python:**

```python
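from openai import OpenAI

# Connect the OpenAI client to the local MAX server.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.responses.create(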
    model="black-forest-labs/FLUX.2-dev",
    input=[
        {
            "role": "user",
            "content": [
                {
                    "type": "input_image",
                    "image_url": "https://example.com/input.png"
                },
                {
                    "type": "input_text",
                    "text": "Your transformation prompt"
                }
            ]
        }
    ],
    extra_body={
        "provider_options": {
            "image": {"height": 1024, "width": 1024, "steps": 28}
        }
    }
)

image_data = response.output[0].content[0].image_data
```

---

**curl:**

```bash
curl -X POST http://localhost:8000/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "black-forest-labs/FLUX.2-dev",
    "input": [
      {
        "role": "user",
        "content": [
          {
            "type": "input_image",
            "image_url": "https://example.com/input.png"
          },
          {
            "type": "input_text",
            "text": "Your transformation prompt"
          }
        ]
      }
    ],
    "provider_options": {
      "image": {"height": 1024, "width": 1024, "steps": 28}
    }
  }'
```
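
The Python and curl examples above send the same JSON body, with the OpenAI client placing `provider_options` at the top level through `extra_body`. As a rough sketch of that shared structure, the hypothetical helper below (`build_image_request` is an illustrative name, not a MAX API) assembles the body from its parts. The defaults mirror the values used in the examples:

```python
import json


def build_image_request(model: str, image_url: str, prompt: str,
                        height: int = 1024, width: int = 1024,
                        steps: int = 28) -> dict:
    """Assemble the JSON body for an image-to-image request."""
    return {
        "model": model,
        "input": [
            {
                "role": "user",
                "content": [
                    # The type field tells the server how to read each part.
                    {"type": "input_image", "image_url": image_url},
                    {"type": "input_text", "text": prompt},
                ],
            }
        ],
        "provider_options": {
            "image": {"height": height, "width": width, "steps": steps}
        },
    }


body = build_image_request(
    "black-forest-labs/FLUX.2-dev",
    "https://example.com/input.png",
    "Your transformation prompt",
)
print(json.dumps(body, indent=2))
```

The printed JSON matches the payload passed to `curl -d` above.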

### Local file input

Local files must be base64-encoded and passed as a data URI in the `image_url`
field using the format `data:<mime-type>;base64,<data>`.
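
The data URI format can be wrapped in a small helper. This is a minimal sketch, assuming the MIME type can be inferred from the file extension; `to_data_uri` is a hypothetical name, and whether the server accepts formats other than PNG is not stated on this page:

```python
import base64
import mimetypes
import tempfile
from pathlib import Path


def to_data_uri(path: str) -> str:
    """Encode a local image file as data:<mime-type>;base64,<data>."""
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"cannot infer an image MIME type for {path!r}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Demonstrate the format with placeholder bytes (not a real image).
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.png"
    sample.write_bytes(b"abc")
    print(to_data_uri(str(sample)))  # data:image/png;base64,YWJj
```

The returned string can be used directly as the `image_url` value.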

**Python:**

```python
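import base64
from openai import OpenAI

# Connect the OpenAI client to the local MAX server.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Read the local file and base64-encode it for use in a data URI.
with open("/path/to/image.png", "rb") as f:
    image_base64 = base64.b64encode(f.read()).decode("utf-8")

response = client.responses.create(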
    model="black-forest-labs/FLUX.2-dev",
    input=[
        {
            "role": "user",
            "content": [
                {
                    "type": "input_image",
                    "image_url": f"data:image/png;base64,{image_base64}"
                },
                {
                    "type": "input_text",
                    "text": "Your transformation prompt"
                }
            ]
        }
    ],
    extra_body={
        "provider_options": {
            "image": {"height": 1024, "width": 1024, "steps": 28}
        }
    }
)

image_data = response.output[0].content[0].image_data
```

---

**curl:**

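```bash
IMAGE_DATA=$(base64 -w 0 /path/to/image.png)

# Write the payload to a file instead of passing it inline.
# request.json is an assumed file name; any path works.
cat <<EOF > request.json
{
  "model": "black-forest-labs/FLUX.2-dev",
  "input": [
    {
      "role": "user",
      "content": [
        {"type": "input_image", "image_url": "data:image/png;base64,${IMAGE_DATA}"},
        {"type": "input_text", "text": "Your transformation prompt"}
      ]
    }
  ],
  "provider_options": {
    "image": {"height": 1024, "width": 1024, "steps": 28}
  }
}
EOF

curl -X POST http://localhost:8000/v1/responses \
  -H "Content-Type: application/json" \
  -d @request.json
```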

## Quickstart

**Python:**

With a MAX server running at `http://localhost:8000`, create a client and make
a request to the model:

```python title="generate-image.py"
import base64
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.responses.create(
    model="black-forest-labs/FLUX.2-dev",
    input="A serene mountain landscape at sunset",
    extra_body={
        "provider_options": {
            "image": {"height": 512, "width": 512, "steps": 28}
        }
    }
)

image_data = response.output[0].content[0].image_data
with open("output-text-to-image.png", "wb") as f:
    f.write(base64.b64decode(image_data))
```

Run the script to generate the image:

```bash
python generate-image.py
```

The script saves the generated image to `output-text-to-image.png` in your
current directory.

---

**curl:**

Send a request to the `v1/responses` endpoint and decode the base64-encoded
image data from the response:

```bash
curl -X POST http://localhost:8000/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "black-forest-labs/FLUX.2-dev",
    "input": "A serene mountain landscape at sunset",
    "provider_options": {
      "image": {"height": 512, "width": 512, "steps": 28}
    }
  }' | jq -r '.output[0].content[0].image_data' | base64 -d > output-text-to-image.png
```

The pipeline extracts the base64 string from the JSON response with `jq`,
decodes it with `base64 -d`, and writes the result to
`output-text-to-image.png`.

Your output should look similar to the following:

<figure>
  <img src={require('./images/image-generation/output-text-to-image.png').default}
       alt="A serene mountain landscape at sunset generated by FLUX.2-dev" width="250" />
  <figcaption>**Figure 1.** Text-to-image output: a serene mountain landscape at sunset.</figcaption>
</figure>
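
To confirm that the `height` and `width` options took effect, you can read the dimensions straight from the file header. This is a minimal sketch, assuming the decoded data is a PNG (the examples save it with a `.png` extension, but this page does not state the format explicitly); `png_dimensions` is a hypothetical helper name:

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data: bytes) -> tuple[int, int]:
    """Return (width, height) read from a PNG's IHDR chunk."""
    # A PNG starts with an 8-byte signature, followed by the IHDR chunk:
    # 4-byte length, 4-byte type, then big-endian width and height.
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    width, height = struct.unpack(">II", data[16:24])
    return width, height


# Synthetic 24-byte header standing in for a real file.
header = PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + struct.pack(">II", 512, 512)
print(png_dimensions(header))  # (512, 512)

# With the quickstart output on disk:
# with open("output-text-to-image.png", "rb") as f:
#     print(png_dimensions(f.read()))
```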

### Use your generated image as input

You can customize the image from the previous step with the image-to-image
workflow, which takes both an image and a text prompt:

**Python:**

Read and encode the output image from the previous step, then send it along
with a text prompt to the model:

```python title="generate-image-to-image.py"
import base64
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

with open("output-text-to-image.png", "rb") as f:
    image_base64 = base64.b64encode(f.read()).decode("utf-8")

response = client.responses.create(
    model="black-forest-labs/FLUX.2-dev",
    input=[
        {
            "role": "user",
            "content": [
                {
                    "type": "input_image",
                    "image_url": f"data:image/png;base64,{image_base64}"
                },
                {
                    "type": "input_text",
                    "text": "Transform this into a watercolor painting"
                }
            ]
        }
    ],
    extra_body={
        "provider_options": {
            "image": {"height": 512, "width": 512, "steps": 28}
        }
    }
)

image_data = response.output[0].content[0].image_data
with open("output-image-to-image.png", "wb") as f:
    f.write(base64.b64decode(image_data))
```

Run the script to generate the image:

```bash
python generate-image-to-image.py
```

The script saves the transformed image to `output-image-to-image.png` in your
current directory.

---

**curl:**

First, encode the output image to base64 format:

```bash
IMAGE_BASE64=$(base64 -w 0 /path/to/image-generation-quickstart/output-text-to-image.png)
```
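
Base64 output is about a third larger than the file it encodes, which matters for how the string is passed to curl. Here is a rough sketch of the arithmetic, assuming the Linux per-argument cap of 32 pages (`MAX_ARG_STRLEN`) and the common 4 KiB page size; the function names are illustrative:

```python
# Assumption: Linux caps a single argument at 32 pages (MAX_ARG_STRLEN),
# which is 131072 bytes with a 4 KiB page size.
MAX_ARG_STRLEN = 32 * 4096


def base64_length(raw_bytes: int) -> int:
    """Length of padded base64 output for raw_bytes of input."""
    # Every 3 input bytes become 4 output characters, rounded up.
    return 4 * ((raw_bytes + 2) // 3)


def fits_in_one_argument(raw_bytes: int, limit: int = MAX_ARG_STRLEN) -> bool:
    """True if the encoded string alone stays under the per-argument cap."""
    return base64_length(raw_bytes) < limit


print(base64_length(300_000))         # 400000
print(fits_in_one_argument(300_000))  # False
print(fits_in_one_argument(90_000))   # True
```

The real `-d` argument also carries the surrounding JSON, so the usable size is slightly smaller than this estimate.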

The base64 string is too large to pass inline: Linux caps the size of a single
command-line argument, and an encoded image can exceed that cap. Instead, store
the request payload in a JSON file:

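```bash
# request.json is an assumed file name; any path works.
cat <<EOF > request.json
{
  "model": "black-forest-labs/FLUX.2-dev",
  "input": [
    {
      "role": "user",
      "content": [
        {
          "type": "input_image",
          "image_url": "data:image/png;base64,${IMAGE_BASE64}"
        },
        {
          "type": "input_text",
          "text": "Transform this into a watercolor painting"
        }
      ]
    }
  ],
  "provider_options": {
    "image": {"height": 512, "width": 512, "steps": 28}
  }
}
EOF

curl -X POST http://localhost:8000/v1/responses \
  -H "Content-Type: application/json" \
  -d @request.json | jq -r '.output[0].content[0].image_data' | base64 -d > output-image-to-image.png
```

<figure>
  <figcaption>**Figure 2.** Image-to-image output: the mountain landscape transformed into a watercolor painting.</figcaption>
</figure>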

## Next steps

Now that you can generate images, explore other inference capabilities and
deployment options.

