    Prompt Injection Attack Uses Images to Trick AI Models

    Prompt injection attack hides malicious instructions in downscaled images, tricking AI models into silent data leaks without user consent.

    Prompt injection just found a new disguise: images. Researchers at Trail of Bits have discovered a stealthy method that hides malicious prompts inside high-resolution pictures. When an AI system downscales one of these images, the model sees the hidden instructions and acts on them without the user realizing it.

    When someone uploads an image to a multimodal AI, the platform often resizes it automatically. That’s where the problem starts. Attackers can design an image so that key text appears only after the system downsizes the image using an interpolation method such as bilinear or bicubic resampling.

    As a result, the AI model ends up reading a prompt it was never supposed to see. The user, meanwhile, sees nothing but a clean image.
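
    The effect can be illustrated with a toy example. The sketch below is not the researchers' technique: it assumes a plain nearest-neighbor downscaler (simpler than the bilinear or bicubic methods mentioned above), a grayscale image stored as nested lists, and a hypothetical scale factor of 4. It only shows how a pattern that occupies a small fraction of the full-size pixels can become the entire downscaled image.

```python
FACTOR = 4  # hypothetical downscale factor, chosen for illustration


def downscale_nearest(pixels, factor):
    """Nearest-neighbor downscale: keep the top-left pixel of each block."""
    return [row[::factor] for row in pixels[::factor]]


def embed_hidden(pattern, factor, background=255):
    """Build a large image whose sampled pixels carry `pattern`.

    Every other pixel is set to the background value, so the pattern
    covers only 1 / factor**2 of the full-size image.
    """
    height = len(pattern) * factor
    width = len(pattern[0]) * factor
    image = [[background] * width for _ in range(height)]
    for y, row in enumerate(pattern):
        for x, value in enumerate(row):
            image[y * factor][x * factor] = value
    return image


# A tiny stand-in for hidden text: 0 is dark, 255 is light.
hidden = [
    [0, 255, 0],
    [0, 0, 0],
    [0, 255, 0],
]

big = embed_hidden(hidden, FACTOR)
small = downscale_nearest(big, FACTOR)

# Share of dark pixels in the full-size image the user would look at.
dark_fraction = sum(v == 0 for row in big for v in row) / (len(big) * len(big[0]))
```

    Here `small` reproduces the hidden pattern exactly, while dark pixels make up only about 5% of `big`. Real attacks against smoothing resamplers need more careful pixel selection, since those filters blend neighboring values instead of picking a single sample.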

    Trail of Bits showed how one image could quietly direct Gemini CLI to leak Google Calendar data. They used a Zapier integration configured with trust=True, a setting that allows the AI to execute tool calls without asking the user first. Because the malicious text appeared only after rescaling, no one saw it coming.

    From the AI’s perspective, it simply followed the full prompt, blending visible and hidden parts into one request.
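
    Why a trust flag matters can be shown with a minimal sketch. Everything here is hypothetical: `run_tool_call`, the `export_calendar` tool name, and the approval callback are illustrative stand-ins, not Gemini CLI's or Zapier's actual code. The sketch assumes a dispatcher that consults the user only when the integration is not trusted.

```python
def run_tool_call(tool_name, args, trust, ask_user):
    """Run a tool call, asking for approval unless the integration is trusted.

    Hypothetical dispatcher for illustration only.
    """
    if not trust and not ask_user(tool_name, args):
        return "blocked"
    return f"executed {tool_name}"


approval_requests = []


def deny_and_log(tool_name, args):
    """Stand-in for a user who is shown the request and refuses it."""
    approval_requests.append(tool_name)
    return False


call_args = {"send_to": "attacker@example.com"}  # hypothetical arguments

# With trust enabled, the user is never asked and the call goes through.
trusted_result = run_tool_call("export_calendar", call_args, True, deny_and_log)

# With trust disabled, the same call is shown to the user, who blocks it.
untrusted_result = run_tool_call("export_calendar", call_args, False, deny_and_log)
```

    In this sketch the trusted call runs without ever reaching the approval callback, which is the gap a hidden instruction can exploit.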

    The attack worked across multiple systems using Google’s Gemini models:

    • Google Gemini CLI
    • Vertex AI Studio
    • Gemini’s web interface
    • Gemini API via the llm CLI
    • Google Assistant on Android
    • Genspark

    Although the researchers tested only these platforms, others that rely on automatic image downscaling could face similar risks.

    To prove the concept, Trail of Bits released a tool called Anamorpher. It generates images that exploit different rescaling algorithms, effectively turning them into prompt injection payloads. With this, security teams (and unfortunately, attackers) can simulate the same behavior across various systems.

    According to the researchers, here’s how developers can start locking things down:

    • Set strict upload dimensions to avoid forced downscaling
    • Show users a preview of what the model will see
    • Ask for confirmation before executing commands triggered by image content
    • Design systems with broader defenses against multi-modal prompt injection
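
    The first recommendation above might look something like the following sketch. The limits, the `UploadDecision` type, and the `check_upload` function are assumptions made for illustration, not part of any platform's API. The idea is to reject oversized images outright so the system never has to downscale them.

```python
from dataclasses import dataclass

# Hypothetical limits; a real service would pick values that match
# the input size its model actually consumes.
MAX_WIDTH = 1024
MAX_HEIGHT = 1024


@dataclass
class UploadDecision:
    accepted: bool
    reason: str


def check_upload(width, height, max_width=MAX_WIDTH, max_height=MAX_HEIGHT):
    """Accept an image only if it already fits within the limits.

    Oversized images are rejected rather than resized, so no hidden
    content can emerge from a forced downscale.
    """
    if width <= 0 or height <= 0:
        return UploadDecision(False, "invalid dimensions")
    if width > max_width or height > max_height:
        return UploadDecision(
            False,
            f"image is {width}x{height}; limit is {max_width}x{max_height}",
        )
    return UploadDecision(True, "within limits; no resize needed")


ok = check_upload(800, 600)
too_big = check_upload(4000, 3000)
```

    A check like this covers only one item on the list; previews and confirmation prompts would still be needed for images that pass it.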

    Clearly, prompt injection has evolved. It’s no longer just a text-based problem; it’s visual now, too. If developers want safe AI, they’ll need to secure every input, not just the obvious ones.
