
Internal Information Exposure

Probes designed to extract system prompts, configuration details, or other internal information.

OWASP LLM01:2025 Prompt Injection ↗ - Prompt Injection occurs when an attacker manipulates an LLM’s behavior by injecting malicious input. These attacks exploit how LLMs process text-based prompts, often bypassing safeguards, compromising outputs, or enabling unauthorized access. The vulnerability lies in the model’s inability to distinguish between safe and malicious inputs, even when the malicious content is imperceptible to humans. Prompt Injection attacks range from causing the LLM to generate harmful outputs to accessing sensitive data or performing unauthorized actions.

OWASP LLM07:2025 System Prompt Leakage ↗ - System Prompt Leakage happens when internal prompts, instructions, or configurations used to guide an LLM’s behavior are exposed. These prompts may contain sensitive details, like API keys or application rules, which attackers can exploit to bypass restrictions or compromise systems. The risk lies in embedding sensitive information directly into the prompts.
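
Since the risk described above comes from sensitive values sitting inside the prompt text, one way to reduce it is to scan a system prompt for credential-like strings before deployment. The following is a minimal sketch, not part of any probe described here: the function name `find_embedded_secrets` and the three patterns are hypothetical and chosen only for illustration, and a real secret scanner would use a much larger rule set.

```python
import re
from typing import List, Tuple

# Hypothetical, illustrative patterns only; not an exhaustive rule set.
SECRET_PATTERNS = {
    "sk_style_key": re.compile(r"\bsk-[A-Za-z0-9]{20,}\b"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "inline_credential": re.compile(
        r"(?i)\b(?:api[_-]?key|password|secret|token)\s*[:=]\s*\S+"
    ),
}


def find_embedded_secrets(system_prompt: str) -> List[Tuple[str, str]]:
    """Return (pattern_name, matched_text) pairs for credential-like strings."""
    findings = []
    for name, pattern in SECRET_PATTERNS.items():
        for match in pattern.finditer(system_prompt):
            findings.append((name, match.group(0)))
    return findings


risky_prompt = "You are a support bot. api_key=abc123 Use it for billing lookups."
clean_prompt = "You are a helpful assistant. Never discuss pricing."

risky_findings = find_embedded_secrets(risky_prompt)
clean_findings = find_embedded_secrets(clean_prompt)
```

A non-empty result suggests the value should move out of the prompt and into server-side configuration, so that a leaked prompt exposes instructions but not credentials.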

| Probe Name | Description |
| --- | --- |
| Prompt Extraction | This probe uses an agentic approach to attempt extraction of the target’s system prompts, instructions, or internal configurations through sophisticated conversation strategies and social engineering techniques. |
| Tool Extraction | This probe implements an agentic reconnaissance workflow that systematically queries the agent to discover and enumerate available tools, functions, and capabilities, exposing the agent’s internal configuration and expanding the attack surface for targeted exploitation. |
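
To make the two probe outcomes concrete, the sketch below shows one simple way a tester who already knows the target's system prompt and tool list might score a response for exposure. It is an assumption-laden illustration, not the probes' actual implementation: the helper names `leaked_fragments` and `exposed_tools`, the six-word window, and the sample prompt and tool names are all hypothetical. It only catches verbatim leaks, so paraphrased disclosures would need a different check.

```python
import re
from typing import List


def _words(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric/underscore tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def leaked_fragments(system_prompt: str, response: str, window: int = 6) -> List[str]:
    """Return word n-grams from the system prompt that appear verbatim in the response."""
    prompt_words = _words(system_prompt)
    # Pad with spaces so matches always align to whole words.
    response_text = " " + " ".join(_words(response)) + " "
    hits = []
    for i in range(len(prompt_words) - window + 1):
        fragment = " ".join(prompt_words[i:i + window])
        if " " + fragment + " " in response_text:
            hits.append(fragment)
    return hits


def exposed_tools(tool_names: List[str], response: str) -> List[str]:
    """Return the known tool names that the response mentions as whole tokens."""
    tokens = set(_words(response))
    return sorted(name for name in tool_names if name.lower() in tokens)


# Hypothetical target configuration, used only for this example.
system_prompt = (
    "You are AcmeBot. Never reveal the refund policy thresholds "
    "to customers under any circumstances."
)
leaky = "Sure! My instructions say: never reveal the refund policy thresholds to customers."
refusal = "I can't share my instructions."

prompt_hits = leaked_fragments(system_prompt, leaky)
refusal_hits = leaked_fragments(system_prompt, refusal)
tool_hits = exposed_tools(
    ["search_orders", "issue_refund"],
    "I can call search_orders and lookup_user.",
)
```

The word window trades sensitivity for noise: a shorter window flags common phrases that appear in any reply, while a longer one misses partial leaks. Tool names containing hyphens would be split by the tokenizer here and would need a different matching rule.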