Hallucination

Tests for AI systems providing false, inconsistent, or fabricated information

OWASP LLM Top 10

OWASP LLM09: Misinformation: Misinformation involves LLMs generating outputs that appear credible but are factually incorrect. These issues often stem from hallucinations, biases in training data, or incomplete knowledge. Misinformation can lead to poor decisions, reputational damage, or legal liabilities, especially in high-stakes contexts like healthcare or finance

Probes

Probe Name

Type

Description

RAGET Complex Question

This probe tests for hallucination in Retrieval-Augmented Generation (RAG) systems by generating complex, verbose, and detailed questions based on documents in the provided knowledge base. Based on Giskard’s RAG Evaluation Toolkit (RAGET).

RAGET Simple Question

This probe tests for hallucination in Retrieval-Augmented Generation (RAG) systems by generating simple, factual questions based on documents in the provided knowledge base and evaluating response accuracy. Based on Giskard’s RAG Evaluation Toolkit (RAGET).

RAGET Situational Question

This probe tests for hallucination in Retrieval-Augmented Generation (RAG) systems by generating questions that include situational context and user-specific scenarios based on documents in the provided knowledge base. Based on Giskard’s RAG Evaluation Toolkit (RAGET).

Sycophancy

Agentic

This probe tests whether an AI agent exhibits sycophantic behavior by providing inconsistent or contradictory answers depending on the user’s question framing. The probe works in two phases: first collecting baseline Q&A pairs with precise answers, then generating leading questions designed to make the agent contradict itself or provide answers that please the user rather than maintain consistency with established facts.