Banana Prompting
What Is Banana Prompting
This skill is for image work where prompt quality determines the result more than any single parameter tweak.
Use it when you need:
- precise subject and composition control
- better text rendering or label-heavy images
- reference-image editing with clear intent
- prompt structures that are reusable across iterations
Treat Banana as a strong image direction model, not a magic box. The job is to describe the visual decision clearly enough that the model can execute it.
Strategy
1. Research before you write prompts
Before inventing a prompt from scratch:
pica prompt find "your topic"
pica skill find "your use case"
- Use pica prompt find to look for proven prompt structures
- Use pica skill find when the task is a recognizable domain such as infographics, brand assets, posters, or product photography
If a specialized skill exists, use it instead of forcing one generic prompt to do everything.
2. Confirm the model family
Do not guess the current Banana model ID; look it up:
pica model search "banana image"
pica model info <model-id>
Use pica model info to confirm:
- whether the model is text-to-image or edit/image-to-image
- the current input field names
- size or aspect-ratio expectations
- whether negative prompts or multiple images are supported
If the task requires heavy text rendering, also compare with Seedream-style models before committing.
3. Start with a visual decision, not adjectives
Weak prompts are vague style soup. Strong prompts make explicit decisions about:
- subject
- action
- environment
- composition
- lighting
- mood
- output purpose
Bad:
a nice modern image for a startup
Better:
Founder portrait in a quiet studio, half-body framing, direct eye contact,
soft side lighting, neutral gray backdrop, editorial photography, calm and
competent mood, shallow depth of field
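A quick way to catch "style soup" before generating is to check a brief against the seven decisions listed above. This is a minimal Python sketch, not part of pica: `REQUIRED_DECISIONS` and `missing_decisions` are hypothetical names introduced here for illustration.

```python
# Hypothetical helper, not part of pica. The keys mirror the seven
# decisions listed above (subject, action, environment, ...).
REQUIRED_DECISIONS = [
    "subject", "action", "environment", "composition",
    "lighting", "mood", "output_purpose",
]


def missing_decisions(brief: dict) -> list:
    """Return the decisions that are absent or blank in a prompt brief."""
    return [
        key for key in REQUIRED_DECISIONS
        if not str(brief.get(key) or "").strip()
    ]


# The "bad" prompt above only implies a subject and a mood.
vague = {"subject": "image for a startup", "mood": "nice, modern"}
print(missing_decisions(vague))
```

Anything the helper reports is a decision the model would otherwise make for you.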
4. Prefer layered prompts over keyword piles
Use a predictable structure:
[Subject + Action] +
[Style / medium] +
[Environment] +
[Composition / camera] +
[Lighting / mood] +
[Critical constraints]
Example:
Single ceramic coffee cup on a travertine table, premium product photography,
sunlit cafe interior, centered three-quarter angle, soft morning light with
gentle shadows, minimal luxury mood, clean background, no extra objects
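If you assemble prompts in code, the same layering can be kept explicit. The sketch below is illustrative Python under the assumption that a comma-joined string is what you send as the prompt; `LAYER_ORDER` and `build_layered_prompt` are hypothetical names, not a pica API.

```python
# Illustrative only: layer names follow the bracketed structure above.
LAYER_ORDER = [
    "subject_action", "style", "environment",
    "composition", "lighting_mood", "constraints",
]


def build_layered_prompt(layers: dict) -> str:
    """Join filled-in layers in a fixed order, skipping empty ones."""
    unknown = set(layers) - set(LAYER_ORDER)
    if unknown:
        raise ValueError(f"unknown layers: {sorted(unknown)}")
    parts = [
        layers[name].strip()
        for name in LAYER_ORDER
        if layers.get(name, "").strip()
    ]
    return ", ".join(parts)


prompt = build_layered_prompt({
    "subject_action": "Single ceramic coffee cup on a travertine table",
    "style": "premium product photography",
    "environment": "sunlit cafe interior",
    "composition": "centered three-quarter angle",
    "lighting_mood": "soft morning light with gentle shadows, minimal luxury mood",
    "constraints": "clean background, no extra objects",
})
print(prompt)
```

Keeping each layer in its own slot makes later iteration easier, because a single layer can be swapped without touching the rest.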
5. Use structure when the scene is dense
For complex scenes, dense layouts, or multi-part illustrations, switch from a loose sentence to a structured prompt.
A practical pattern is:
{
  "objective": "What the image must communicate",
  "subject": "Main entity or entities",
  "environment": "Setting and world details",
  "composition": "Framing, perspective, hierarchy",
  "lighting": "Time of day, light quality, shadows",
  "style": "Medium or visual language",
  "constraints": "Must-have and must-avoid details"
}
This is especially useful for:
- infographics
- posters
- UI-style diagrams
- editorial scenes with many interacting elements
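A structured prompt of this shape can be generated rather than hand-typed, which avoids quoting mistakes. The following is a sketch in Python, assuming the JSON text itself is passed as the prompt string; `FIELDS` and `structured_prompt` are hypothetical names, and the water-cycle content is made up for illustration.

```python
import json

# Field names copied from the pattern above; order is preserved on output.
FIELDS = [
    "objective", "subject", "environment", "composition",
    "lighting", "style", "constraints",
]


def structured_prompt(**fields) -> str:
    """Serialize a complete structured prompt, rejecting gaps and typos."""
    unknown = sorted(set(fields) - set(FIELDS))
    missing = [n for n in FIELDS if not str(fields.get(n) or "").strip()]
    if unknown or missing:
        raise ValueError(f"unknown fields: {unknown}; missing fields: {missing}")
    ordered = {name: fields[name] for name in FIELDS}
    # ensure_ascii=False keeps non-Latin label text readable in the prompt.
    return json.dumps(ordered, ensure_ascii=False, indent=2)


text = structured_prompt(
    objective="Explain the water cycle in one poster",
    subject="Ocean, clouds, mountain, river",
    environment="Stylized coastal landscape",
    composition="Circular flow, clockwise, title at top",
    lighting="Bright daylight, soft shadows",
    style="Flat vector illustration",
    constraints="Four labeled stages, no decorative text",
)
print(text)
```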
6. Iterate with intent
Do not change five things at once. After each result, decide which layer failed:
- subject wrong -> rewrite subject/action
- framing wrong -> rewrite composition/camera
- mood wrong -> rewrite lighting/color language
- too generic -> replace abstract adjectives with concrete art direction
- text or labels wrong -> rewrite exact strings and hierarchy
When the failure is structural, rewrite the prompt architecture. When the failure is small, keep the prompt stable and edit one constraint.
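One way to enforce single-variable iteration is to store each attempt as a dictionary of layers and confirm that only one layer changed between versions. This is a minimal sketch with hypothetical helper names (`revise`, `changed_layers`); it tracks prompts only and does not call any model.

```python
def revise(baseline: dict, layer: str, new_value: str) -> dict:
    """Return a copy of the baseline with exactly one layer rewritten."""
    if layer not in baseline:
        raise KeyError(f"no such layer: {layer}")
    revised = dict(baseline)
    revised[layer] = new_value
    return revised


def changed_layers(before: dict, after: dict) -> list:
    """List the layers whose text differs between two prompt versions."""
    keys = before.keys() | after.keys()
    return sorted(k for k in keys if before.get(k) != after.get(k))


v1 = {
    "subject_action": "Founder portrait in a quiet studio",
    "composition": "half-body framing, direct eye contact",
    "lighting_mood": "soft side lighting, calm and competent mood",
}
# The mood came out wrong, so only the lighting layer is rewritten.
v2 = revise(v1, "lighting_mood", "warm window light, relaxed and approachable mood")
print(changed_layers(v1, v2))
```

If `changed_layers` ever reports more than one entry, the comparison with the previous result is no longer clean.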
Prompt Patterns
Product / object shot
Single [object] on [surface], premium product photography, [environment],
[camera angle], [lighting], clean composition, commercial quality, no extra props
Portrait
[person description], editorial portrait photography, [framing], [background],
[lighting], [expression], sharp focus, natural skin texture
Stylized illustration
[subject], [illustration style], [palette], [composition], [background treatment],
clear focal point, polished finish
Label-heavy educational image
Detailed educational graphic with title at top: [exact title].
Layout: [overall structure].
Sections: [section-by-section content].
Labels: [exact label text].
Connectors: [arrows / callouts / relationships].
Style: [visual treatment].
For label-heavy work, precision beats brevity.
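The label-heavy template can be filled programmatically so that exact strings are never retyped by hand. The sketch below is illustrative Python; `label_heavy_prompt` is a hypothetical helper and the water-cycle content is invented for the example.

```python
def label_heavy_prompt(title, layout, sections, labels, connectors, style) -> str:
    """Fill the label-heavy template above, quoting each label verbatim."""
    lines = [
        f"Detailed educational graphic with title at top: {title}.",
        f"Layout: {layout}.",
        "Sections: " + "; ".join(sections) + ".",
        "Labels: " + ", ".join(f'"{label}"' for label in labels) + ".",
        f"Connectors: {connectors}.",
        f"Style: {style}.",
    ]
    return "\n".join(lines)


print(label_heavy_prompt(
    title="The Water Cycle",
    layout="circular diagram, four stages clockwise",
    sections=["ocean at bottom left", "clouds at top", "mountain at right"],
    labels=["Evaporation", "Condensation", "Precipitation", "Collection"],
    connectors="curved arrows between consecutive stages",
    style="flat vector, limited blue and green palette",
))
```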
Composition Vocabulary
Use camera and layout language the model can act on:
- close-up, half-body, wide shot, bird's-eye view
- centered composition, rule of thirds, symmetrical layout
- shallow depth of field, deep focus, negative space
- poster layout, comparison columns, vertical timeline, diagrammatic callouts
Avoid aesthetic filler like "very cool", "beautiful", or "amazing". Those words do not create composition.
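A simple lint pass can flag filler before a prompt is sent. This is a rough sketch in Python: the `FILLER` tuple contains only the three examples named above, and `find_filler` is a hypothetical helper you would extend with your own list.

```python
import re

# Only the filler terms named above; extend as needed.
FILLER = ("very cool", "beautiful", "amazing")


def find_filler(prompt: str) -> list:
    """Return the filler terms that appear as whole words in the prompt."""
    lowered = prompt.lower()
    return [
        term for term in FILLER
        if re.search(rf"\b{re.escape(term)}\b", lowered)
    ]


print(find_filler("A very cool, beautiful poster, rule of thirds, negative space"))
```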
Text Rendering Guidance
Banana-class models can render text well, but only when the text job is specified clearly.
Rules:
- include exact text for titles, labels, or quotes
- keep each text fragment short
- define where the text belongs
- separate title, labels, and quotes by function
Example:
Title at top: AI时代创业者的稀缺品质.
Left label: 艺术家型 Artist Type.
Right label: 正常型 Normal Type.
Bottom quote banner: 没有乔布斯的命却得了乔布斯的病.
If the task depends on perfect brand wordmarks or long copy blocks, compare against a text-specialized image model before final delivery.
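The rules above can be checked mechanically when text instructions are built in code. The sketch below is illustrative Python: `text_block` is a hypothetical helper, and the 40-character limit is an assumption made here, since the guidance only says "short".

```python
MAX_CHARS = 40  # assumed threshold; the guidance above only says "short"


def text_block(placements: list) -> str:
    """Turn (position, exact text) pairs into one instruction line each."""
    lines = []
    for position, text in placements:
        if len(text) > MAX_CHARS:
            raise ValueError(f"text for {position!r} is longer than {MAX_CHARS} characters")
        lines.append(f"{position}: {text}.")
    return "\n".join(lines)


print(text_block([
    ("Title at top", "Quarterly Roadmap"),
    ("Left label", "Shipped"),
    ("Right label", "Planned"),
]))
```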
Reference-Driven Work
When the model supports image input, use the reference as the anchor for identity, product shape, or composition. Then use the prompt to specify the change:
- keep identity, change wardrobe
- keep product, change environment
- keep layout, add labels
- keep style, alter pose or camera distance
In edit mode, prompt the delta, not the whole image description.
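A delta-style edit prompt separates what stays from what changes. This is a minimal sketch in Python; `edit_prompt` is a hypothetical helper, and the exact wording it produces is one reasonable phrasing, not a format any model requires.

```python
def edit_prompt(keep: list, change: str) -> str:
    """Build an edit instruction that names what to keep and the one change."""
    kept = [item.strip() for item in keep if item.strip()]
    if not kept or not change.strip():
        raise ValueError("an edit prompt needs something to keep and a change")
    return f"Keep {' and '.join(kept)} unchanged. Change only: {change.strip()}."


print(edit_prompt(
    keep=["the product shape", "the label design"],
    change="place the bottle on wet slate outdoors at dusk",
))
```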
Common Failure Modes
- Too vague: the prompt names a theme, not a scene
- Too many conflicting styles: the image has no coherent visual language
- Missing composition: the model improvises a mediocre layout
- Treating infographics like illustrations: no hierarchy, no labels, no argument
- Overwriting everything during iteration: no stable baseline for comparison
Workflow Example
pica model search "banana image"
pica model info <model-id>
pica prompt find "editorial portrait"
pica generate \
  --model <model-id> \
  --kind image_generation \
  --input '{
    "prompt": "Founder portrait in a quiet studio, half-body framing, soft side lighting, editorial photography, neutral gray backdrop, calm and competent mood"
  }'
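When the generate call is scripted, building the --input payload with a JSON serializer avoids hand-escaping quotes inside the prompt. The sketch below is illustrative Python that only constructs the argument list and does not run it. It uses the flags shown in the command above; the `prompt` field name is taken from that example and should be confirmed with pica model info, and `generate_command` is a hypothetical helper.

```python
import json
import shlex


def generate_command(model_id: str, prompt: str) -> list:
    """Build the argv for the generate call shown above (not executed here)."""
    # "prompt" as the input field name is an assumption from the example;
    # confirm the real field names with `pica model info <model-id>`.
    payload = json.dumps({"prompt": prompt}, ensure_ascii=False)
    return [
        "pica", "generate",
        "--model", model_id,
        "--kind", "image_generation",
        "--input", payload,
    ]


argv = generate_command(
    "<model-id>",
    'Founder portrait in a quiet studio, sign reading "Day One", soft side lighting',
)
print(shlex.join(argv))  # shell-safe preview of the command
```

Passing `argv` as a list to `subprocess.run` would sidestep shell quoting entirely.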
Related Skills
- Search pica skill find "infographic" when the image is information-dense
- Search pica skill find "brand kit" when the task is logo or brand-system creation