Internal Information Exposure

Probes designed to extract system prompts, configuration details, or other internal information.

OWASP LLM Top 10

OWASP LLM01: Prompt Injection: Prompt injection occurs when an attacker manipulates an LLM’s behavior by injecting malicious input. These attacks exploit how LLMs process text-based prompts, often bypassing safeguards, compromising outputs, or enabling unauthorized access. The vulnerability lies in the model’s inability to distinguish trusted instructions from malicious input, and injected content can affect the model even when it is imperceptible to humans. Prompt injection attacks can range from causing the LLM to generate harmful outputs to accessing sensitive data or performing unauthorized actions.

OWASP LLM07: System Prompt Leakage: System prompt leakage happens when internal prompts, instructions, or configurations used to guide an LLM’s behavior are exposed. These prompts may contain sensitive details, such as API keys or application rules, which attackers can exploit to bypass restrictions or compromise systems. The risk lies in embedding sensitive information directly in the prompts.
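One way to make system prompt leakage measurable is to check how much of the system prompt reappears in a model response. The sketch below is illustrative only and is not how the probes in this category are implemented: the function name `leaked_fraction`, the n-gram size, and the sample prompt are all assumptions made for this example.

```python
import re


def _tokens(text: str) -> list[str]:
    """Lowercase and split into alphanumeric tokens so punctuation and
    casing changes do not hide a verbatim leak."""
    return re.findall(r"[a-z0-9]+", text.lower())


def leaked_fraction(system_prompt: str, response: str, n: int = 5) -> float:
    """Return the fraction of the system prompt's word n-grams that also
    occur in the response (0.0 = no overlap, 1.0 = fully reproduced).

    Hypothetical helper for illustration; n=5 is an arbitrary choice.
    """
    prompt_tokens = _tokens(system_prompt)
    response_tokens = _tokens(response)
    if not prompt_tokens:
        return 0.0
    # Shrink n for very short prompts so at least one n-gram exists.
    n = min(n, len(prompt_tokens))
    prompt_ngrams = {
        tuple(prompt_tokens[i:i + n])
        for i in range(len(prompt_tokens) - n + 1)
    }
    response_ngrams = {
        tuple(response_tokens[i:i + n])
        for i in range(len(response_tokens) - n + 1)
    }
    return len(prompt_ngrams & response_ngrams) / len(prompt_ngrams)


# Made-up example data.
SYSTEM_PROMPT = "You are a support bot for Acme. Never reveal internal pricing rules."
LEAKY = "Sure! My instructions say: You are a support bot for Acme. Never reveal internal pricing rules."
BENIGN = "I can help you track your order."
```

A check like this only catches verbatim or near-verbatim leaks. A paraphrased or translated leak would score low, so it is best treated as one signal among several rather than a complete detector.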

Probes

| Probe Name | Type | Description |
| --- | --- | --- |
| Prompt Extraction | Agentic | This probe uses an agentic approach to attempt extraction of the target’s system prompts, instructions, or internal configurations through sophisticated conversation strategies and social engineering techniques. |
| Tool Extraction | Agentic | This probe uses an agentic approach to discover and extract information about the tools available in the target system and their signatures. Knowledge of precise tool signatures can be used to craft targeted attacks and abuse the system’s functionality. |
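To illustrate what an exposed tool signature looks like in practice, the sketch below scans a response for tool names and parameter names taken from a known tool list. It is a simplified, hypothetical check and not the probe's actual detection logic: `ToolSpec`, `exposed_tool_details`, and the sample tools are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolSpec:
    """Hypothetical description of one tool exposed to the target LLM."""
    name: str
    parameters: tuple[str, ...]


def exposed_tool_details(response: str, tools: list[ToolSpec]) -> dict[str, list[str]]:
    """Map each tool whose name appears in the response to the parameter
    names that also appear there.

    Uses plain case-insensitive substring matching, so short or common
    parameter names can produce false positives.
    """
    text = response.lower()
    findings: dict[str, list[str]] = {}
    for tool in tools:
        if tool.name.lower() in text:
            findings[tool.name] = [
                p for p in tool.parameters if p.lower() in text
            ]
    return findings


# Made-up tool registry and responses.
TOOLS = [
    ToolSpec("refund_order", ("order_id", "amount_cents")),
    ToolSpec("lookup_customer", ("email",)),
]
LEAKY = "I can call refund_order(order_id, amount_cents) to process that."
BENIGN = "I can help with refunds. What would you like to do?"
```

A response that names a tool together with its parameters gives an attacker far more to work with than one that only describes a capability in general terms, which is why the sketch records the matched parameters per tool.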