Human-in-the-loop isn’t enough: New attack turns AI safeguards into exploits

by admin December 18, 2025

written by admin December 18, 2025

CheckMarx demonstrated that attackers can manipulate these dialogs by hiding or misrepresenting malicious instructions, like padding payloads with benign-looking text, pushing dangerous commands out of the visible view, or crafting prompts that cause the AI to generate misleading summaries of what will actually execute.

In terminal-style interfaces, especially, long or formatted outputs make this kind of deception easy to miss. Since many AI agents operate with elevated privileges, a single misled approval can translate directly into code execution, running OS commands, file system access, or downstream compromise, according to CheckMarx findings.

Beyond padding or truncation, the researchers also described other dialog-forging techniques that abuse how confirmation is rendered. By leveraging Markdown rendering and layout behaviors, attackers can visually separate benign text from hidden commands or manipulate summaries so the human-visible description isn’t malicious.

Source link

Human-in-the-loop isn’t enough: New attack turns AI safeguards into exploits

North Korea-Linked Hackers Steal $2.02 Billion in 2025, Leading Global Crypto Theft

Google releases fast AI model Gemini 3 Flash – Computerworld

Related Articles

Leave a Comment Cancel Reply