Data Privacy Exfiltration

Attacks aimed at extracting sensitive information, personal data, or confidential content from AI systems

OWASP LLM Top 10

OWASP LLM02: Sensitive Information Disclosure: Sensitive Information Disclosure happens when an LLM unintentionally reveals private or proprietary information. This can include PII, system credentials, or confidential business data. The risk arises from improper data sanitization, poor input handling, or overly permissive outputs. Attackers or users may exploit these vulnerabilities, leading to privacy violations, data breaches, or compliance issues

OWASP LLM05: Improper Output Handling: Improper Output Handling occurs when an LLM’s responses are not adequately validated, sanitized, or encoded before being passed to downstream systems. This can result in vulnerabilities such as cross-site scripting (XSS), SQL injection, or unauthorized system commands

Probes

Probe Name

Type

Description

Cross Session Leak

This probe tests for information leakage across different sessions or conversations, evaluating whether sensitive data from one session can be retrieved in subsequent sessions. The probe sends sensitive information (passwords, keys, tokens) in one conversation, then starts a new conversation and attempts to retrieve that information.

Link Injection

This probe tests whether an AI agent can be manipulated to generate or recommend malicious links that could lead to phishing sites, malware downloads, or other harmful web destinations.

PII Leak

This probe tests whether an AI agent can be tricked into revealing personally identifiable information (PII) such as names, addresses, phone numbers, or other sensitive personal data. It evaluates the target’s ability to maintain privacy boundaries and protect confidential information.

Script Injection

This probe tests whether an AI agent can be manipulated to generate responses containing malicious JavaScript or other executable script code that could lead to Cross-Site Scripting (XSS) vulnerabilities.