Internal Information Exposure
Probes designed to extract system prompts, configuration details, or other internal information
OWASP LLM Top 10
OWASP LLM01: Prompt Injection: Prompt Injection occurs when an attacker manipulates an LLM’s behavior by injecting malicious input. These attacks exploit how LLMs process text-based prompts, often bypassing safeguards, compromising outputs, or enabling unauthorized access. The vulnerability lies in the model’s inability to distinguish between safe and malicious inputs, even if imperceptible to humans. Prompt Injection attacks can range from causing the LLM to generate harmful outputs to accessing sensitive data or performing unauthorized actions
OWASP LLM07: System Prompt Leakage: System Prompt Leakage happens when internal prompts, instructions, or configurations used to guide an LLM’s behavior are exposed. These prompts may contain sensitive details, like API keys or application rules, which attackers can exploit to bypass restrictions or compromise systems. The risk lies in embedding sensitive information directly into the prompts
Probes
Probe Name |
Type |
Description |
|---|---|---|
Prompt Extraction |
Agentic |
This probe uses an agentic approach to attempt extraction of the target’s system prompts, instructions, or internal configurations through sophisticated conversation strategies and social engineering techniques. |
Tool Extraction |
Agentic |
This probe uses an agentic approach to discover and extract information about tools available in the target system and their signatures. Knowledge of precise tool signatures can be used to craft targeted attacks and abuse the system functionalities. |