The rapid evolution and enterprise adoption of AI has motivated bad actors to target these systems with greater frequency and sophistication. Many security leaders recognize the importance and urgency of AI security, but don’t yet have processes in place to effectively manage and mitigate emerging AI risks with comprehensive coverage of the entire adversarial AI threat landscape.
Robust Intelligence (now a part of Cisco) and the UK AI Security Institute partnered with the National Institute of Standards and Technology (NIST) to release the latest update to the Adversarial Machine Learning Taxonomy. This transatlantic partnership aimed to fill this need for a comprehensive adversarial AI threat landscape, while creating alignment across regions in standardizing an approach to understanding and mitigating adversarial AI.
Survey results from the Global Cybersecurity Outlook 2025 published by the World Economic Forum highlight the gap between AI adoption and preparedness: “While 66% of organizations expect AI to have the most significant impact on cybersecurity in the year to come, only 37% report having processes in place to assess the security of AI tools before deployment.”
In order to successfully mitigate these attacks, it’s imperative that AI and cybersecurity communities are well informed about today’s AI security challenges. To that end, we’ve co-authored the 2025 update to NIST’s taxonomy and terminology of adversarial machine learning.
Let’s take a glance at what’s new in this latest update to the publication, walk through the taxonomies of attacks and mitigations at a high level, and then briefly reflect on the purpose of taxonomies themselves—what are they for, and why are they so useful?
What’s new?
The previous iteration of the NIST Adversarial Machine Learning Taxonomy focused on predictive AI, models designed to make accurate predictions based on historical data patterns. Individual adversarial techniques were grouped into three primary attacker objectives: availability breakdown, integrity violations, and privacy compromise. It also included a preliminary AI attacker technique landscape for generative AI, models that generate new content based on existing data. Generative AI adopted all three adversarial technique groups and added misuse violations as an additional category.
In the latest update of the taxonomy, we expand on the generative AI adversarial techniques and violations section, while also ensuring the predictive AI section remains accurate and relevant to today’s adversarial AI landscape. One of the major updates to the latest version is the addition of an index of techniques and violations at the beginning of the document. Not only does this make the taxonomy easier to navigate, but it allows for an easier way to reference techniques and violations in external references to the taxonomy. This makes the taxonomy a more practical resource to AI security practitioners.
Clarifying attacks on Predictive AI models
The three attacker objectives consistent across predictive and generative AI sections, are as follows:
- Availability breakdown attacks degrade the performance and availability of a model for its users.
- Integrity violations attempt to undermine model integrity and generate incorrect outputs.
- Privacy compromises unintended leakage of restricted or proprietary information such as information about the underlying model and training data.

Classifying attacks on Generative AI models
The generative AI taxonomy inherits the same three attacker objectives as predictive AI—availability, integrity, and privacy—and encapsulates additional individual techniques. There’s a fourth attacker objective unique to generative AI: misuse violations. The updated version of the taxonomy expanded on generative AI adversarial techniques to account for the most up-to-date landscape of attacker techniques.
Misuse violations repurpose the capabilities of generative AI to further an adversary’s malicious objectives by creating harmful content that supports cyber-attack initiatives.
Harms associated with misuse violations are intended to produce outputs that could cause harm to others. For example, attackers could use direct prompting attacks to bypass model defenses and produce harmful or undesirable output.

To achieve one or several of these goals, adversaries can leverage a number of techniques. The expansion of the generative AI section highlights attacker techniques unique to generative AI, such as direct prompt injection, data extraction, and indirect prompt injection. In addition, there is an entirely new arsenal of supply chain attacks. Supply chain attacks are not a violation specific to a model, and therefore are not included in the above taxonomy diagram.
Supply chain attacks are rooted in the complexity and inherited risk of the AI supply chain. Every component—open-source models and third-party data, for example—can introduce security issues into the entire system.
These can be mitigated with supply chain assurance practices such as vulnerability scanning and validation of datasets.
Direct prompt injection alters the behavior of a model through direct input from an adversary. This can be done to create intentionally malicious content or for sensitive data extraction.
Mitigation measures include training for alignment and deploying a real-time prompt injection detection solution for added security.
Indirect prompt injection differs in that adversarial inputs are delivered via a third-party channel. This technique can help further several objectives: manipulation of information, data extraction, unauthorized disclosure, fraud, malware distribution, and more.
Proposed mitigations help minimize risk through reinforcement learning from human feedback, input filtering, and the use of an LLM moderator or interpretability-based solution.
What are taxonomies for, anyways?
Co-author and Cisco Director of AI & Security, Hyrum Anderson, put it best when he said that “taxonomies are most obviously important to organize our understanding of attack methods, capabilities, and objectives. They also have a long tail effect in improving communication and collaboration in a field that’s moving very quickly.”
It’s why Cisco strives to aid in the creation and continuous improvement of shared standards, collaborating with leading organizations like NIST and the UK AI Security Institute.
These resources give us better mental models for classifying and discussing new techniques and capabilities. Awareness and education of these vulnerabilities facilitate the development of more resilient AI systems and more informed standards and policies.
You can review the entire NIST Adversarial Machine Learning Taxonomy and learn more with a complete glossary of key terminology in the full paper.
We’d love to hear what you think. Ask a Question, Comment Below, and Stay Connected with Cisco Secure on social!
Cisco Security Social Channels
Instagram
Facebook
Twitter
LinkedIn
Share:
Leave a Reply