2.0.0 (2025-09-25)

We’re releasing an upgraded LLM vulnerability scanner in Giskard Hub, specifically designed to secure conversational AI agents in production environments. This enterprise version deploys autonomous red teaming agents that conduct dynamic, multi-turn attacks across dozens of vulnerability categories covering more than 40 probes.

What’s new?

Comprehensive LLM Vulnerabilities Coverage

The scanner covers LLM vulnerabilities across established OWASP categories and business failures:

Prompt Injection (OWASP LLM 01) - Attacks that manipulate AI agents through carefully crafted prompts
Training Data Extraction (OWASP LLM 02) - Attempts to extract or infer information from the AI model’s training data
Data Privacy Exfiltration (OWASP LLM 05) - Attacks aimed at extracting sensitive information
Excessive Agency (OWASP LLM 06) - Tests whether AI agents can be manipulated beyond their intended scope
Hallucination & Misinformation (OWASP LLM 08) - Tests for false, inconsistent, or fabricated information
Denial of Service (OWASP LLM 10) - Attacks that attempt to cause resource exhaustion
Internal Information Exposure - Attempts to extract system prompts and configuration details
Harmful Content Generation - Probes that bypass safety measures
Brand Damage & Reputation - Tests for reputational risks
Legal & Financial Risk - Attacks exposing deployers to liabilities
Unauthorized Professional Advice - Tests for advice outside intended scope

Business Alignment

Evaluates both security vulnerabilities and business failures, automatically validating business logic by generating expected outputs from knowledge bases.

Domain-specific Attacks

Adapts testing methodologies to agent-specific contexts using bot descriptions, tools specification, and knowledge bases for realistic evaluation.

Multi-turn Attack Simulation

Implements dynamic multi-turn testing that simulates realistic conversation flows, detecting context-dependent vulnerabilities that emerge through conversation history.

Adaptive AI Red Teaming

Adjusts attack strategies based on agent resistance, escalating tactics or pivoting approaches when encountering defenses.

Root-cause Analysis

Every detected vulnerability includes detailed explanations of attack methodology and severity scoring for prioritized remediation.

Continuous Red Teaming

Detected vulnerabilities automatically convert into reusable tests for continuous validation and integration into golden datasets.

What’s changed?

Removed support for importing and exporting knowledge bases (KB) in CSV format. Only JSON and JSONL formats are now supported for KB import/export.
In the client library version 2.0.0, legacy functions have been deprecated and removed. Notably, the previous ‘conversations’ functionality has been replaced by ‘chat_test_cases’ to improve clarity and consistency across the product.

What’s fixed?

Fixed an issue with document embedding when handling a single large document (#86).
Resolved a bug related to access of notification preferences, ensuring all users have appropriate access regardless of their permissions (#87).
Corrected a problem where new environment creation did not set the Keycloak secret correctly.
Fixed mismatches between displayed statistics and actual items in evaluation lists.
Addressed a bug affecting failure category editing.
Fixed incorrect styling on the “move conversation” button.
Resolved issues with failure categories not functioning properly when using a local model.

How to get started?

Configure vulnerability scope - Select specific vulnerability categories relevant to your use case
Execute the scan - The system runs hundreds of probes across security and business logic areas
Analyze results by severity - Results are organized by criticality for prioritized review
Review individual probes - Each probe provides detailed attack descriptions and explanations
Turn into continuous tests - Successful probes can convert into tests for continuous validation

This release enables detection of sophisticated attacks that evolve across multiple conversation turns, automatically generating attacks, analyzing system responses, and modifying approaches to help correct agents with re-executable tests.