Vulnerability Categories
Comprehensive guide to AI security vulnerabilities and attack patterns tested by Giskard’s vulnerability scan.
The vulnerability scan uses specialized probes (structured adversarial tests) to stress-test AI systems and uncover weaknesses before malicious actors do. Each probe is designed to expose specific vulnerabilities in AI agents, from harmful content generation to unauthorized system access.
This catalog organizes vulnerabilities by risk category and provides detailed information about:
Attack patterns and techniques
Specific probes used for testing
Detection indicators
Mitigation strategies
Risk levels and business impact
Use this guide to understand the security landscape for AI systems and make informed decisions about which vulnerabilities to prioritize in your testing.
Overview
At Giskard, we use probes to stress-test AI systems and uncover vulnerabilities before malicious actors do. A probe is a structured adversarial test designed to expose weaknesses in an AI agent, such as harmful content generation, data leakage, or unauthorized tool execution. By simulating real-world attacks, probes help teams identify and fix risks early—reducing both security threats and business failures.
Below you’ll find the full catalog of probes, organized by vulnerability category. Each category includes a short explanation and detailed information about the corresponding probes.
Security Risks
These vulnerabilities represent direct security threats that can compromise system integrity, data confidentiality, and authorized access controls.
Malicious prompts that bypass your agent’s safety instructions
Attempts to expose sensitive data from your model’s training
Leakage of system configurations or internal data
Unauthorized access to user data or privacy violations
Safety Risks
These risks focus on content safety, harmful behavior generation, and inappropriate system responses that could cause immediate harm.
Toxic, offensive, or policy-violating content creation
Actions beyond intended scope or authority level
Resource exhaustion attacks that disable your system
Business Risks
These vulnerabilities pose risks to business operations, reputation, compliance, and long-term organizational success.
False or misleading information that damages trust
Outputs that harm your brand or public perception
Content leading to legal liability or financial harm
Advice outside your agent’s intended expertise
Probe Count Overview
Category |
Probe Count |
---|---|
15 |
|
15 |
|
4 |
|
7 |
|
4 |
|
1 |
|
1 |
|
1 |
|
2 |
|
1 |
|
1 |
|
Total |
52 |
Probes by OWASP LLM Top 10
By OWASP LLM Top 10: