Abstract: In recent years, large vision-language models (VLMs) have achieved significant breakthroughs in cross-modal understanding and generation. However, the safety issues arising from their multimodal interactions have become increasingly prominent. VLMs are vulnerable to jailbreak attacks, in which attackers craft carefully designed prompts to bypass safety mechanisms and induce the models to generate harmful content. To address this, we investigate the alignment between visual inputs and task execution, uncovering locality defects and attention biases in VLMs. Based on these findings, we propose VOTI, a novel jailbreak framework leveraging visual obfuscation and task induction. VOTI subtly embeds malicious keywords within neutral image layouts to evade detection and decomposes harmful queries into a sequence of subtasks. This approach disperses malicious intent across modalities, exploiting VLMs’ over-reliance on local visual cues and their fragility in multi-step reasoning to bypass global safety mechanisms. Implemented as an automated framework, VOTI integrates large language models as red-team assistants to generate and iteratively optimize jailbreak strategies. Extensive experiments across seven mainstream VLMs demonstrate VOTI’s effectiveness, achieving a 73.46% attack success rate on GPT-4o-0513. These results reveal critical vulnerabilities in VLMs and highlight the urgent need for robust defenses and improved multimodal alignment.
Keywords: large vision-language models; jailbreak attacks; red teaming; security of large models; safety alignment