
# GPT Image 2

## Overview

GPT Image 2 is OpenAI's latest image model and the first to go toe-to-toe with Nano Banana 2 on quality, resolution, and speed. It pulls ahead on anything involving text, layout structure, or brand accuracy. It is available in three quality tiers to match the job and the budget.

| Quality tier      | Low                                              | Medium                                           | High                                                         |
| ----------------- | ------------------------------------------------ | ------------------------------------------------ | ------------------------------------------------------------ |
| Best for          | High-volume iterations, drafts, batch generation | Production assets, static ads, UGC-style visuals | Editorial shoots, hero shots, complex compositions with text |
| Quality reference | On par with Nano Banana                          | On par with Nano Banana 2                        | On par with Nano Banana Pro                                  |
| Text rendering    | Clean                                            | Production-ready                                 | Production-ready                                             |
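
If tiers are chosen in a script rather than by hand, the table above can be encoded as a small lookup. The following is a minimal sketch: the job labels, the `pick_quality_tier` helper, and the lowercase tier strings are assumptions made for illustration, not part of any documented API.

```python
# Hypothetical job labels mapped to the tiers suggested by the table above.
TIER_BY_JOB = {
    "draft": "low",
    "iteration": "low",
    "batch": "low",
    "production_asset": "medium",
    "static_ad": "medium",
    "ugc": "medium",
    "editorial": "high",
    "hero_shot": "high",
    "complex_text_composition": "high",
}


def pick_quality_tier(job: str) -> str:
    """Return the suggested tier for a job label, or fail loudly if unknown."""
    if job not in TIER_BY_JOB:
        raise ValueError(f"Unknown job type: {job!r}")
    return TIER_BY_JOB[job]


# Usage: a static ad maps to the Medium tier.
print(pick_quality_tier("static_ad"))  # medium
```

Failing on unknown labels, rather than defaulting to a tier, keeps a typo from silently sending a hero shot through the Low tier.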

## Strengths for marketers

* **Text that works on the first try**: Headlines, CTAs, packaging fine print, and dense paragraphs render correctly without Photoshop fixes.
* **Strong art direction**: Avoids the over-polished AI look. On par with Nano Banana Pro for editorial outputs.
* **Structured layouts**: Treats spatial instructions (grids, infographics, UI mockups) as rules, not suggestions.

### Ideal use cases

* **Complex-text product consistency**: Packaging, labels, and product copy rendered legibly across SKUs.
* **Product shots**: Production-quality hero shots from a product image and a scene description.
* **Editorial photos**: High-end fashion shoots, lifestyle moodboards, multi-frame e-commerce grids with consistent art direction.
* **Realistic UGC photos**: iPhone-style product-in-hand shots, ideal for Meta Ads or as first frames for UGC video.
* **Static ads**: Banner ads, social graphics, posters with readable headlines, CTAs, and fine print — composed in one pass.

### Weaknesses

* **Resizing**: Tends to add elements that weren't in the original creative when resizing or adapting formats.
* **Fine product consistency**: Nano Banana Pro is still ahead on micro-details when preserving an exact product across edits.
* **Content policy**: Strict content filtering makes it not viable for underwear, swimwear, or lingerie brands.
* **Precise edits**: Nano Banana Pro is more precise for surgical edits.

***

## How to use effectively

GPT Image 2 rewards specific, structured prompts — especially when text or layout is involved.

#### Key principles

* Be explicit with copy: put exact text in quotes (`"Hold my hand!"`) and specify where it appears (header, label, CTA).
* Specify layout when structure matters: grids, columns, hierarchy, spacing.
* For editorial work, treat prompts like art direction briefs: subject, environment, lighting, camera, atmosphere.
* Pick the right quality tier — Low for iteration and drafts, Medium for production assets, High for hero shots and dense text.
* For text-heavy assets, keep copy reasonable in length and break dense text into clear lines.
* Use reference images for product shots, but expect strict content policy filtering on sensitive verticals.
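
The first and fifth principles (quote the exact copy, name its placement, and break dense text into clear lines) can be applied mechanically before a prompt is assembled. The sketch below is one possible approach: the `format_copy` helper, the 40-character default, and the "line N" labelling are assumptions for illustration, not a documented requirement of the model.

```python
import textwrap


def format_copy(text: str, placement: str, max_chars_per_line: int = 40) -> str:
    """Quote copy, label its placement, and split long copy into short lines."""
    if '"' in text:
        # Double quotes delimit the copy in the prompt, so reject them inside it.
        raise ValueError("Copy must not contain double quotes.")
    lines = textwrap.wrap(text, width=max_chars_per_line)
    if not lines:
        raise ValueError("Copy must not be empty.")
    if len(lines) == 1:
        return f'{placement}: "{lines[0]}"'
    return "\n".join(
        f'{placement}, line {i}: "{line}"' for i, line in enumerate(lines, start=1)
    )


# Usage: short copy stays on one line, longer copy is split and numbered.
print(format_copy("Soft hands, every day.", "Headline"))
print(format_copy("Dermatologically tested. Free shipping over 30 euros.", "Fine print", 30))
```

The line width is a tuning knob rather than a rule; the point is that each quoted fragment is short enough to be checked against the rendered image.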

#### Effective prompting patterns

<details>

<summary>For editorial photos</summary>

`A photorealistic [shot type] of [subject], [action or expression], set in [environment]. Lighting: [description]. Camera: [lens, angle, framing]. Atmosphere: [reference style, color treatment, era]. Mood: [adjectives]. Composition emphasizes [key textures, details].`

</details>

<details>

<summary>For UGC-style photos</summary>

`Casual iPhone photograph of [person] holding [product] in [environment]. Natural lighting, slight motion blur, authentic feel — like a real customer post. Product label and branding clearly visible. Composition is unstaged and candid.`

</details>

<details>

<summary>For product shots with packaging</summary>

`Place the exact product shown in the reference image into the scene below. Preserve the [packaging element]'s typography, label layout, [color] color blocking, and the "[wordmark]" exactly as shown. Scene: [environment]. Lighting: [description]. Aesthetic: [reference].`

</details>

<details>

<summary>For static ads with text</summary>

`A [format] ad for [brand]. Headline: "[exact headline copy]". CTA button: "[exact CTA]". Optional fine print: "[exact text]". Visual: [scene/subject]. Brand colors: [hex or name]. Typography should feel [adjective]. Layout: [hierarchy description].`

</details>

<details>

<summary>For website &amp; UI mockups</summary>

`Create a [page type] mockup for [brand/product]. Sections: [hero, features, pricing...]. Copy: hero headline "[text]", subheadline "[text]", CTA "[text]". Visual style: [reference], color harmony focused on [color]. Layout should feel [adjective], with [grid/depth instructions].`

</details>

<details>

<summary>For complex-text consistency</summary>

`Generate [number] variations of [product] keeping all text elements identical: "[wordmark]", "[callout]", "[fine print]". Vary only [scene / lighting / angle]. Maintain font weight, kerning, and color across outputs.`

</details>

<details>

<summary>For multi-frame editorial grids</summary>

`Create a polished [N]x[N] grid of [shot type] for [brand]. No visible margins or gutters between frames. Brand should feel [adjectives]. Each frame: [subject and action]. Color harmony focused on [color]. Layout should feel intentional, with depth and contrast across frames.`

</details>
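
When these patterns are reused across many assets, the bracketed placeholders can be filled programmatically so that none reaches the model unfilled. This is a minimal sketch using the static-ad pattern above; the `fill_template` helper and the sample values are hypothetical and only illustrate the idea.

```python
import re

# The static-ad pattern from this page, with its bracketed placeholders intact.
STATIC_AD_TEMPLATE = (
    'A [format] ad for [brand]. Headline: "[exact headline copy]". '
    'CTA button: "[exact CTA]". Optional fine print: "[exact text]". '
    "Visual: [scene/subject]. Brand colors: [hex or name]. "
    "Typography should feel [adjective]. Layout: [hierarchy description]."
)

# Matches one [placeholder] and captures the name between the brackets.
PLACEHOLDER = re.compile(r"\[([^\[\]]+)\]")


def fill_template(template: str, values: dict[str, str]) -> str:
    """Replace every [placeholder]; raise if any placeholder has no value."""
    missing = [name for name in PLACEHOLDER.findall(template) if name not in values]
    if missing:
        raise KeyError(f"Unfilled placeholders: {missing}")
    return PLACEHOLDER.sub(lambda match: values[match.group(1)], template)


# Usage with illustrative values (not taken from a real campaign).
prompt = fill_template(
    STATIC_AD_TEMPLATE,
    {
        "format": "square Instagram",
        "brand": "a hand cream brand",
        "exact headline copy": "Soft hands, every day.",
        "exact CTA": "Shop now",
        "exact text": "Dermatologically tested.",
        "scene/subject": "hand cream tube on a folded white waffle towel",
        "hex or name": "pink and white",
        "adjective": "clean",
        "hierarchy description": "headline on top, product centered, CTA below",
    },
)
print(prompt)
```

Raising on a missing value is deliberate: a literal `[brand]` left in the prompt could otherwise end up rendered as text in the image.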

#### Example prompts

* **Editorial moodboard**: "Create a polished 3x3 grid of e-commerce photoshoots for a leather brand. Playful, design-forward, vibrant — Parisian fashion energy meets premium creative tech. Color harmony focused on orange. No visible margins. Each frame intentional, with contrast in body composition and lighting."
* **UGC iPhone shot**: "Casual iPhone selfie of a young woman holding \[product] at a sunlit café. Slightly blurred, authentic, real — feels like a customer post. Product label clearly readable. Natural composition, no staging."
* **Static ad with copy**: "Square Instagram ad. Headline: 'Soft hands, every day.' CTA: 'Shop now.' Pink and white color blocking. Hand cream tube on a folded white waffle towel, polished concrete background. Spa hotel aesthetic."
* **UI mockup**: "Landing page hero for a leather goods brand. Headline: 'Crafted in Paris, made for daily.' Subhead: 'Handmade leather bags, designed to last.' CTA: 'Shop the collection.' Hero image: orange leather tote on neutral background. Layout feels editorial, with generous whitespace."
* **Packaging consistency**: "Place the exact tube from the reference image on a polished concrete vanity, beside a printed card reading 'for our guest'. Preserve the typography, the 'Hold my hand!' callout, and the pink screw cap exactly as shown. Beige linen background, brushed nickel fittings."
* **Multi-page e-commerce mockup**: "Multi-page e-commerce mockup for a leather brand. Playful, design-forward, vibrant. High-end grid with depth and intentionality. Color harmony: orange."

