Nano Banana (+ Pro)
Revolutionary model that combines generation and editing with exceptional natural language understanding.
TL;DR
Nano Banana handles both image generation and editing from natural language prompts. It is available in two versions: Standard (Nano Banana) and Pro.
Comparison (Standard | Pro)
Best for: Quick, cost-effective edits, image fusion, character consistency | Static ads, layout variants, product renders, high-res assets
Product rendering: Good | Stellar
Typography: Basic | Clean, ad-ready
Resolution: 1080p | Up to 4K (configurable 1K to 4K)
Strengths for marketers
Simple, natural language editing: Make complex edits with simple conversational prompts like "enhance this photo" or "change the background."
Multi-image fusion: Seamlessly blend multiple images into cohesive new visuals for product placement and scene creation.
Character consistency: Maintain the same person or mascot across different scenes and environments.
Ideal use cases
Product placements: Seamlessly insert products into lifestyle scenes and environments.
Video ads first frame: Create compelling opening frames for video ads and social media content.
Turn photos into ads: Transform existing photos into polished advertising materials with text and branding.
Character/mascot campaigns: Consistent brand characters across multiple marketing materials.
Multi-step editing: Complex photo enhancements that would require multiple traditional editing steps.
Static ad creation (Pro): Generate polished ads with clean typography and multiple layout options.
Layout variations (Pro): Generate an asset's channel variants in seconds, no rework needed.
Weaknesses
Variance in image processing: Output quality can be inconsistent across different inputs.
Sensitivity to text prompts: When used with reference images, the model needs precise wording to produce the desired result consistently.
Limited aspect ratio control: In image-to-image use, the output ratio follows the input images rather than a user-specified value.
How to use effectively
Nano Banana responds best to natural, conversational prompts.
Key principles
Describe the scene, don't list keywords
Use natural language as if talking to a skilled photo editor
Leverage multi-image inputs for fusion and reference
Use preservation instructions: be specific about what should change and what must stay the same (e.g., "identical to the original")
Take advantage of world knowledge for contextual accuracy
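The principles above can be sketched as a small prompt builder. This is an illustration only: build_edit_prompt and its three arguments are hypothetical names, not part of any Nano Banana API. It joins a scene description, the requested change, and a preservation instruction into full sentences rather than a keyword list.

```python
from typing import List


def build_edit_prompt(scene: str, change: str, preserve: List[str]) -> str:
    """Compose a conversational edit prompt (hypothetical helper, not an official API)."""
    parts = [scene.strip(), change.strip()]
    if preserve:
        # Preservation instruction: state explicitly what must not change
        parts.append("Keep " + " and ".join(preserve) + " identical to the original.")
    # Emit full sentences instead of a comma-separated keyword list
    return " ".join(p if p.endswith(".") else p + "." for p in parts if p)


prompt = build_edit_prompt(
    scene="A woman is presenting on stage with a microphone",
    change="Replace the background with a modern office",
    preserve=["her face", "her outfit"],
)
print(prompt)
```

The resulting string can be pasted into whichever interface you use to run the model.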
Effective prompting patterns
Example prompts
Simple edits: "Remove the person in the background" or "Make this photo brighter"
Reference image integration: "Place the man from Image 2 next to the woman in Image 1. They are seated together, sharing a laugh as they look at a tablet. Keep the ambient lighting and depth of field from Image 1. Maintain skin tones consistent with the original scene."
Image fusion: "Place this product in that lifestyle setting"
Complex transformations: "Turn this into black-and-white manga style"
Multi-step edits: "Remove everything except the woman and mic", then "make her a 3D character in an office"
Multi-references: "Make employee badges using this design template and this photo"
Control aspect ratio: "Create a 9:16 Instagram Story version of this ad with neon background and space at top for text."
Product photography: "Replace the black bottle with the green 'HULK' can from Image 2. Match hand pose, reflections, and metal specular highlights. Keep label readable and preserve text legibility; no stylization."
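The multi-step pattern above (one prompt, then a follow-up on the result) can be sketched as a simple chain. The code assumes you already have some function that takes an image and a prompt and returns an edited image. Here that function is a stand-in called fake_edit, and run_edit_chain is likewise a hypothetical name. No real model call is made.

```python
from typing import Callable, List


def run_edit_chain(
    image: bytes,
    prompts: List[str],
    edit_fn: Callable[[bytes, str], bytes],
) -> bytes:
    """Apply prompts one at a time, feeding each result into the next step."""
    current = image
    for prompt in prompts:
        current = edit_fn(current, prompt)
    return current


def fake_edit(image: bytes, prompt: str) -> bytes:
    # Stand-in for a real model call (assumption): it only records the prompt
    return image + b"|" + prompt.encode("utf-8")


steps = [
    "Remove everything except the woman and mic",
    "Make her a 3D character in an office",
]
result = run_edit_chain(b"original", steps, fake_edit)
print(result)
```

Keeping each step as its own prompt makes it easier to see which instruction caused an unwanted change.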
Model parameters
Inputs accepted
Text
Text + Multiple reference images
Output characteristics
Default resolution: 1080p; up to 4K with Nano Banana Pro.
Available aspect ratios: Auto, determined as follows:
Driven by the text prompt when used as text-to-image
Driven by the reference image(s) when used as image-to-image
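The parameters above can be summarized in a small data model. This is a sketch under stated assumptions: ImageRequest, its field names, and the "standard"/"pro" labels are hypothetical and do not mirror a real request schema. Only the rules (text is always required, reference images are optional, the aspect ratio follows the prompt or the references, the resolution ceiling depends on the version) come from this section.

```python
from dataclasses import dataclass, field
from typing import List

# Resolution ceilings as stated in this section; the keys are hypothetical labels
MAX_RESOLUTION = {"standard": "1080p", "pro": "4K"}


@dataclass
class ImageRequest:
    prompt: str
    reference_images: List[str] = field(default_factory=list)
    version: str = "standard"

    def __post_init__(self) -> None:
        # Both accepted input forms include text, so an empty prompt is rejected
        if not self.prompt.strip():
            raise ValueError("a text prompt is required")
        if self.version not in MAX_RESOLUTION:
            raise ValueError("version must be 'standard' or 'pro'")

    def mode(self) -> str:
        return "image-to-image" if self.reference_images else "text-to-image"

    def aspect_ratio_source(self) -> str:
        # Auto aspect ratio: references win when present, otherwise the prompt
        return "reference images" if self.reference_images else "text prompt"

    def max_resolution(self) -> str:
        return MAX_RESOLUTION[self.version]


request = ImageRequest(
    "Place this product in that lifestyle setting",
    reference_images=["product.png", "scene.png"],
    version="pro",
)
print(request.mode(), request.aspect_ratio_source(), request.max_resolution())
```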