Attack categories

Comprehensive guide to AI security vulnerabilities and attack patterns tested by Giskard’s vulnerability scan.

The vulnerability scan uses specialized probes (structured adversarial tests) to stress-test AI systems and uncover weaknesses before malicious actors do. Each probe is designed to expose specific vulnerabilities in AI agents, from harmful content generation to unauthorized system access.

This catalog organizes vulnerabilities by risk category and provides detailed information about:

Attack patterns and techniques
Specific probes used for testing
Detection indicators
Mitigation strategies
Risk levels and business impact

Use this guide to understand the security landscape for AI systems and make informed decisions about which vulnerabilities to prioritize in your testing.

Overview

At Giskard, we use probes to stress-test AI systems and uncover vulnerabilities before malicious actors do. A probe is a structured adversarial test designed to expose weaknesses in an AI agent, such as harmful content generation, data leakage, or unauthorized tool execution. By simulating real-world attacks, probes help teams identify and fix risks early—reducing both security threats and business failures.

Below you’ll find the full catalog of probes, organized by vulnerability category. Each category includes a short explanation and detailed information about the corresponding probes.

Security Risks

These vulnerabilities represent direct security threats that can compromise system integrity, data confidentiality, and authorized access controls.

🔓 Prompt Injection

Malicious prompts that bypass your agent’s safety instructions

Prompt Injection (OWASP LLM 01)

📊 Training Data Extraction

Attempts to expose sensitive data from your model’s training

Training Data Extraction (OWASP LLM 02)

🔍 Internal Information Exposure

Leakage of system configurations or internal data

Internal Information Exposure (OWASP LLM 01-07)

🛡️ Data Privacy & Exfiltration

Unauthorized access to user data or privacy violations

Data Privacy & Exfiltration (OWASP LLM 05)

Safety Risks

These risks focus on content safety, harmful behavior generation, and inappropriate system responses that could cause immediate harm.

⚠️ Harmful Content Generation

Toxic, offensive, or policy-violating content creation

Harmful Content Generation

🚫 Excessive Agency

Actions beyond intended scope or authority level

Excessive Agency (OWASP LLM 06)

💥 Denial of Service

Resource exhaustion attacks that disable your system

Denial of Service (OWASP LLM 10)

Business Risks

These vulnerabilities pose risks to business operations, reputation, compliance, and long-term organizational success.

🤔 Hallucination & Misinformation

False or misleading information that damages trust

Hallucination & Misinformation (OWASP LLM 09)

📉 Brand Damage & Reputation

Outputs that harm your brand or public perception

Brand Damage & Reputation

⚖️ Legal & Financial Risk

Content leading to legal liability or financial harm

Legal & Financial Risk

💼 Misguidance & Unauthorized Advice

Advice outside your agent’s intended expertise

Misguidance & Unauthorized Advice

Probe Count Overview

Probe Distribution by Category
Category	Probe Count
Prompt Injection	15
Harmful Content Generation	16
Hallucination & Misinformation	4
Excessive Agency	7
Data Privacy Exfiltration	4
Internal Information Exposure	2
Training Data Extraction	1
Denial of Service	1
Brand Damage & Reputation	2
Legal & Financial Risk	1
Misguidance & Unauthorized Advice	1
Total	54

Probes by OWASP LLM Top 10

By OWASP LLM Top 10: