Hallucination
Tests for AI systems providing false, inconsistent, or fabricated information
OWASP LLM Top 10
OWASP LLM09: Misinformation: Misinformation involves LLMs generating outputs that appear credible but are factually incorrect. These issues often stem from hallucinations, biases in training data, or incomplete knowledge. Misinformation can lead to poor decisions, reputational damage, or legal liabilities, especially in high-stakes contexts like healthcare or finance
Probes
Probe Name |
Type |
Description |
|---|---|---|
RAGET Complex Question |
This probe tests for hallucination in Retrieval-Augmented Generation (RAG) systems by generating complex, verbose, and detailed questions based on documents in the provided knowledge base. Based on Giskard’s RAG Evaluation Toolkit (RAGET). |
|
RAGET Simple Question |
This probe tests for hallucination in Retrieval-Augmented Generation (RAG) systems by generating simple, factual questions based on documents in the provided knowledge base and evaluating response accuracy. Based on Giskard’s RAG Evaluation Toolkit (RAGET). |
|
RAGET Situational Question |
This probe tests for hallucination in Retrieval-Augmented Generation (RAG) systems by generating questions that include situational context and user-specific scenarios based on documents in the provided knowledge base. Based on Giskard’s RAG Evaluation Toolkit (RAGET). |
|
Sycophancy |
Agentic |
This probe tests whether an AI agent exhibits sycophantic behavior by providing inconsistent or contradictory answers depending on the user’s question framing. The probe works in two phases: first collecting baseline Q&A pairs with precise answers, then generating leading questions designed to make the agent contradict itself or provide answers that please the user rather than maintain consistency with established facts. |