
Threats and harms: Adversaries exploit vulnerabilities across both domains, and oftentimes, link content manipulation with technical exploits to achieve their objectives. A security attack, such as injecting malicious instructions or corrupting training data, often culminates in a safety failure, such as generating harmful content, leaking confidential information, or producing unwanted or harmful outputs, Chang stated. The AI Security and Safety Framework’s taxonomy brings these elements into a single structure that organizations can use to understand risk holistically and build defenses that address both the mechanism of attack and the resulting impact.
AI lifecycle: Vulnerabilities that are irrelevant during model development may become critical once the model gains access to tooling or interacts with other agents. The AI Security and Safety Framework follows the model across this entire journey, making it clear where different categories of risk emerge and how they may evolve, and letting organizations implement defense-in-depth strategies that account for how risks evolve as AI systems progress from development to production.
Multi-agent orchestration: The AI Security and Safety Framework can also account for the risks that emerge when AI systems work together, encompassing orchestration patterns, inter-agent communication protocols, shared memory architectures, and collaborative decision-making processes, Chang stated.
Multimodal threats: Threats can emerge from text prompts, audio commands, maliciously constructed images, manipulated video, corrupted code snippets, or even embedded signals in sensor data, Chang stated. As we continue to research how multimodal threats can manifest, treating these pathways consistently is essential, especially as organizations adopt multimodal systems in robotics and autonomous vehicle deployments, customer experience platforms, and real-time monitoring environments, Chang stated.
Audience-aware: Finally, the framework is intentionally designed for multiple audiences. Executives can operate at the level of attacker objectives, security leaders can focus on techniques, while engineers and researchers can dive deeper into sub techniques. Drilling down even further, AI red teams and threat intelligence teams can build, test, and evaluate procedures. All of these groups can share a single conceptual model, creating alignment that has been missing from the industry, Chang stated.
The framework includes the supporting infrastructure, complex supply chains, organizational policies, and human-in-the-loop interactions that collectively determine security outcomes. This enables clearer communication between AI developers, AI end-users, business functions, security practitioners, and governance and compliance entities, Chang stated.
