Safety Evaluation of Google’s Gemini Nano Banana Image Model under Adversarial and Realistic Prompt Conditions

Server: Preprints.org
DOI: 10.20944/preprints202511.0211.v1

This study evaluates the safety performance of Google’s Gemini image generation system under realistic user conditions. Eight cases were tested using 24 prompt-response attempts that produced 21 images. The experiments covered single-turn prompts, multi-turn “circular prompting,” ambiguous inputs, and prompt-injection exploits. Outputs were classified by dual human review as Safe, Suggestive, or Explicit. Gemini generated unsafe content frequently: of the 21 images, 19 (about 90%) contained unsafe material, of which twelve were suggestive and seven were explicit. Direct explicit prompts were consistently blocked, but multi-turn escalation and repeated injections bypassed moderation. Prompt injection succeeded in image mode but failed in text mode, showing a mode-specific weakness. Persistent injection attempts fully removed the remaining safeguards, allowing the model to produce explicit imagery. The paper contributes a structured testing framework and evidence of reproducible failure patterns in multimodal safety enforcement. Results indicate the need for stronger context-aware moderation, durable safety states across sessions, explicit disambiguation for risky prompts, and robust injection-resistance mechanisms. Without these corrections, on-device and online deployments of Gemini or any untested image generation model remain vulnerable to systematic policy circumvention.
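
The counts reported in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the variable names and the `pct` helper are assumptions introduced here, not part of the paper's testing framework, and the only inputs are the four figures stated in the abstract (24 attempts, 21 images, 12 suggestive, 7 explicit).

```python
# Counts taken directly from the abstract.
attempts = 24     # prompt-response attempts
images = 21       # images produced
suggestive = 12   # images classified as Suggestive
explicit = 7      # images classified as Explicit

# Derived counts (simple arithmetic on the figures above).
unsafe = suggestive + explicit   # images with unsafe material
safe = images - unsafe           # images classified as Safe
no_image = attempts - images     # attempts that yielded no image


def pct(part, whole):
    """Hypothetical helper: percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


summary = {
    "unsafe": unsafe,
    "safe": safe,
    "no_image": no_image,
    "unsafe_pct_of_images": pct(unsafe, images),
    "suggestive_pct_of_images": pct(suggestive, images),
    "explicit_pct_of_images": pct(explicit, images),
    "no_image_pct_of_attempts": pct(no_image, attempts),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Under these figures, the suggestive and explicit counts sum to the 19 unsafe images the abstract reports, leaving two images classified as Safe and three attempts that produced no image. The abstract does not say why those three attempts yielded no image, so the sketch does not label them as blocked.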
