Picadabra

What Is Banana Prompting

This skill is for image work where prompt quality determines the result more than any single parameter tweak.

Use it when you need:

precise subject and composition control
better text rendering or label-heavy images
reference-image editing with clear intent
prompt structures that are reusable across iterations

Treat Banana as a strong image direction model, not a magic box. The job is to describe the visual decision clearly enough that the model can execute it.

Strategy

1. Research before you write prompts

Before inventing a prompt from scratch:

pica prompt find "your topic"
pica skill find "your use case"

Use pica prompt find to look for proven prompt structures
Use pica skill find when the task is a recognizable domain such as infographics, brand assets, posters, or product photography

If a specialized skill exists, use it instead of forcing one generic prompt to do everything.

2. Confirm the model family

Do not guess the current Banana model ID:

pica model search "banana image"
pica model info <model-id>

Use pica model info to confirm:

whether the model is text-to-image or edit/image-to-image
the current input field names
size or aspect-ratio expectations
whether negative prompts or multiple images are supported

If the task requires heavy text rendering, also compare with Seedream-style models before committing.

3. Start with a visual decision, not adjectives

Weak prompts are vague style soup. Strong prompts make explicit decisions about:

subject
action
environment
composition
lighting
mood
output purpose

Bad:

a nice modern image for a startup

Better:

Founder portrait in a quiet studio, half-body framing, direct eye contact,
soft side lighting, neutral gray backdrop, editorial photography, calm and
competent mood, shallow depth of field

4. Prefer layered prompts over keyword piles

Use a predictable structure:

[Subject + Action] +
[Style / medium] +
[Environment] +
[Composition / camera] +
[Lighting / mood] +
[Critical constraints]

Example:

Single ceramic coffee cup on a travertine table, premium product photography,
sunlit cafe interior, centered three-quarter angle, soft morning light with
gentle shadows, minimal luxury mood, clean background, no extra objects
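
One way to keep the layers explicit is to hold each one in its own field and join them only when the prompt is sent. The sketch below is an illustration, not part of any pica tooling: the `LayeredPrompt` class and its field names are hypothetical, chosen to mirror the structure above.

```python
from dataclasses import dataclass


@dataclass
class LayeredPrompt:
    """Hypothetical container: one field per layer of the structure above."""
    subject_action: str
    style: str
    environment: str
    composition: str
    lighting_mood: str
    constraints: str = ""

    def render(self) -> str:
        # Join the non-empty layers in a fixed order so every prompt reads the same way.
        layers = [
            self.subject_action,
            self.style,
            self.environment,
            self.composition,
            self.lighting_mood,
            self.constraints,
        ]
        return ", ".join(layer.strip() for layer in layers if layer.strip())


coffee = LayeredPrompt(
    subject_action="Single ceramic coffee cup on a travertine table",
    style="premium product photography",
    environment="sunlit cafe interior",
    composition="centered three-quarter angle",
    lighting_mood="soft morning light with gentle shadows, minimal luxury mood",
    constraints="clean background, no extra objects",
)
print(coffee.render())
```

Keeping layers separate makes it easy to see which decision is missing before the prompt is ever sent.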

5. Use structure when the scene is dense

For complex scenes, dense layouts, or multi-part illustrations, switch from a loose sentence to a structured prompt.

A practical pattern is:

{
	"objective": "What the image must communicate",
	"subject": "Main entity or entities",
	"environment": "Setting and world details",
	"composition": "Framing, perspective, hierarchy",
	"lighting": "Time of day, light quality, shadows",
	"style": "Medium or visual language",
	"constraints": "Must-have and must-avoid details"
}

This is especially useful for:

infographics
posters
UI-style diagrams
editorial scenes with many interacting elements
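
A structured prompt is easier to keep complete if it is built programmatically rather than typed by hand. The following is a minimal sketch that assumes the seven keys from the pattern above; the `build_structured_prompt` helper and the sample timeline content are hypothetical, and whether a given model accepts JSON text as its prompt should be confirmed with pica model info.

```python
import json

# Keys taken from the structured pattern above, kept in the same order.
REQUIRED_KEYS = (
    "objective",
    "subject",
    "environment",
    "composition",
    "lighting",
    "style",
    "constraints",
)


def build_structured_prompt(**fields: str) -> str:
    """Return the prompt as JSON text; fail loudly if a layer is missing or blank."""
    missing = [key for key in REQUIRED_KEYS if not fields.get(key, "").strip()]
    if missing:
        raise ValueError(f"missing prompt fields: {', '.join(missing)}")
    # Unknown extra keys are ignored so the output always has the same shape.
    ordered = {key: fields[key].strip() for key in REQUIRED_KEYS}
    return json.dumps(ordered, ensure_ascii=False, indent=2)


print(build_structured_prompt(
    objective="Explain a three-step onboarding flow at a glance",
    subject="Three numbered cards connected by arrows",
    environment="Flat off-white canvas, no scenery",
    composition="Left-to-right horizontal timeline, equal card sizes",
    lighting="Flat, shadowless",
    style="Clean vector infographic",
    constraints="Exactly three cards, no decorative icons",
))
```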

6. Iterate with intent

Do not change five things at once. After each result, decide which layer failed:

subject wrong -> rewrite subject/action
framing wrong -> rewrite composition/camera
mood wrong -> rewrite lighting/color language
too generic -> replace abstract adjectives with concrete art direction
text or labels wrong -> rewrite exact strings and hierarchy

When the failure is structural, rewrite the prompt architecture. When the failure is small, keep the prompt stable and edit one constraint.
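
The one-change-per-iteration rule can be enforced mechanically. This sketch assumes the prompt is stored as a dictionary of layers; the `revise` and `changed_layers` helpers and the layer names are hypothetical.

```python
def revise(baseline: dict[str, str], layer: str, new_value: str) -> dict[str, str]:
    """Return a copy of the baseline with exactly one layer rewritten."""
    if layer not in baseline:
        raise KeyError(f"unknown layer: {layer}")
    revised = dict(baseline)  # copy, so the baseline stays available for comparison
    revised[layer] = new_value
    return revised


def changed_layers(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """List the layers whose text differs between two prompt versions."""
    return [key for key in before if before[key] != after.get(key)]


baseline = {
    "subject": "Founder portrait in a quiet studio",
    "composition": "half-body framing, direct eye contact",
    "lighting": "soft side lighting",
}

# The framing was wrong, so only the composition layer is rewritten.
v2 = revise(baseline, "composition", "close-up, rule of thirds, direct eye contact")
print(changed_layers(baseline, v2))
```

If `changed_layers` ever reports more than one entry between consecutive versions, the comparison between their results is no longer clean.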

Prompt Patterns

Product / object shot

Single [object] on [surface], premium product photography, [environment],
[camera angle], [lighting], clean composition, commercial quality, no extra props

Portrait

[person description], editorial portrait photography, [framing], [background],
[lighting], [expression], sharp focus, natural skin texture

Stylized illustration

[subject], [illustration style], [palette], [composition], [background treatment],
clear focal point, polished finish

Label-heavy educational image

Detailed educational graphic with title at top: [exact title].
Layout: [overall structure].
Sections: [section-by-section content].
Labels: [exact label text].
Connectors: [arrows / callouts / relationships].
Style: [visual treatment].

For label-heavy work, precision beats brevity.

Composition Vocabulary

Use camera and layout language the model can act on:

close-up, half-body, wide shot, bird's-eye view
centered composition, rule of thirds, symmetrical layout
shallow depth of field, deep focus, negative space
poster layout, comparison columns, vertical timeline, diagrammatic callouts

Avoid aesthetic filler like "very cool", "beautiful", or "amazing". Those words do not create composition.
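
A quick lint pass can catch filler before a prompt is sent. The sketch below is illustrative only: the `find_filler` helper is hypothetical, and its word list contains just the three examples named above, so extend it to match your own habits.

```python
import re

# Only the filler terms named in the text; this is not an exhaustive list.
FILLER_TERMS = ("very cool", "beautiful", "amazing")


def find_filler(prompt: str, terms: tuple[str, ...] = FILLER_TERMS) -> list[str]:
    """Return the filler terms that appear in the prompt as whole words."""
    lowered = prompt.lower()
    return [
        term for term in terms
        if re.search(rf"\b{re.escape(term)}\b", lowered)
    ]


print(find_filler("A beautiful, amazing poster for a startup"))
print(find_filler("Half-body portrait, rule of thirds, shallow depth of field"))
```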

Text Rendering Guidance

Banana-class models can render text surprisingly well, but only when the text job is specified clearly.

Rules:

include exact text for titles, labels, or quotes
keep each text fragment short
define where the text belongs
separate title, labels, and quotes by function

Example:

Title at top: AI时代创业者的稀缺品质.
Left label: 艺术家型 Artist Type.
Right label: 正常型 Normal Type.
Bottom quote banner: 没有乔布斯的命却得了乔布斯的病.

If the task depends on perfect brand wordmarks or long copy blocks, compare a text-specialized image model before final delivery.
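
The text rules above can be checked before the prompt is sent. This is a minimal sketch: the `TextSlot` class and `render_text_spec` helper are hypothetical, and the 40-character limit is an arbitrary working threshold, not a documented model constraint.

```python
from dataclasses import dataclass

MAX_FRAGMENT_CHARS = 40  # assumed threshold for "short"; tune per model and language


@dataclass
class TextSlot:
    placement: str  # where the text belongs and what it is, e.g. "Title at top"
    text: str       # the exact string to render


def render_text_spec(slots: list[TextSlot], max_chars: int = MAX_FRAGMENT_CHARS) -> str:
    """Emit one 'placement: exact text.' line per slot, rejecting blank or long text."""
    lines = []
    for slot in slots:
        text = slot.text.strip()
        if not text:
            raise ValueError(f"{slot.placement}: text is empty")
        if len(text) > max_chars:
            raise ValueError(f"{slot.placement}: {len(text)} chars exceeds {max_chars}")
        lines.append(f"{slot.placement}: {text}.")
    return "\n".join(lines)


print(render_text_spec([
    TextSlot("Title at top", "How Rain Forms"),
    TextSlot("Left label", "Evaporation"),
    TextSlot("Right label", "Condensation"),
]))
```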

Reference-Driven Work

When the model supports image input, use the reference image as the anchor for identity, product shape, or composition. Then use the prompt to specify the change:

keep identity, change wardrobe
keep product, change environment
keep layout, add labels
keep style, alter pose or camera distance

In edit mode, prompt the delta, not the whole image description.
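
A delta prompt can be reduced to two parts: what stays and what changes. The sketch below assumes that framing; the `delta_prompt` helper and its output wording are hypothetical, and the phrasing that works best may differ by model.

```python
def delta_prompt(keep: list[str], change: str) -> str:
    """Build an edit-mode prompt that states what to preserve and the single change."""
    kept = [item.strip() for item in keep if item.strip()]
    if not kept:
        raise ValueError("name at least one thing to keep")
    if not change.strip():
        raise ValueError("describe the change")
    return f"Keep {', '.join(kept)} unchanged. Change only: {change.strip()}."


print(delta_prompt(
    keep=["the person's identity", "pose"],
    change="wardrobe to a navy blazer",
))
```

Note that the prompt never redescribes the scene; the reference image already carries it.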

Common Failure Modes

Too vague: the prompt names a theme, not a scene
Too many conflicting styles: the image has no coherent visual language
Missing composition: the model improvises a mediocre layout
Treating infographics like illustrations: no hierarchy, no labels, no argument
Overwriting everything during iteration: no stable baseline for comparison

Workflow Example

pica model search "banana image"
pica model info <model-id>
pica prompt find "editorial portrait"
pica generate \
  --model <model-id> \
  --kind image_generation \
  --input '{
    "prompt": "Founder portrait in a quiet studio, half-body framing, soft side lighting, editorial photography, neutral gray backdrop, calm and competent mood"
  }'
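
Hand-written JSON inside single quotes breaks as soon as the prompt contains an apostrophe or a double quote. The sketch below builds the same command line with safe quoting. It uses only the subcommand and flags shown in the example above; the `build_generate_command` helper is hypothetical, the model ID is a placeholder, and the "prompt" field name should be confirmed with pica model info.

```python
import json
import shlex


def build_generate_command(model_id: str, prompt: str) -> str:
    """Return a shell-safe 'pica generate' command line for a text prompt."""
    # json.dumps handles quotes and non-ASCII text inside the prompt.
    payload = json.dumps({"prompt": prompt}, ensure_ascii=False)
    argv = [
        "pica", "generate",
        "--model", model_id,
        "--kind", "image_generation",
        "--input", payload,
    ]
    # shlex.join quotes each argument for a POSIX shell.
    return shlex.join(argv)


command = build_generate_command(
    "<model-id>",  # placeholder: use the ID returned by pica model search
    "Founder's portrait in a quiet studio, half-body framing, soft side lighting",
)
print(command)
```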

Related Skills

Run pica skill find "infographic" when the image is information-dense
Run pica skill find "brand kit" when the task is logo or brand-system creation