Welcome to Giskard
Welcome to Giskard! This section will help you understand what Giskard is, choose the right offering for your needs, and get started quickly.
- Giskard Hub - Our enterprise platform for LLM agent testing with team collaboration and continuous red teaming. It offers both a user-friendly UI for business users and a powerful SDK for technical users.
- Giskard Open Source - Our open-source Python library for LLM testing and evaluation. It offers a programmatic interface for technical users, with core testing capabilities to get started.
- Giskard Research - Our research on AI safety and security
Giskard Hub
Giskard Hub is our enterprise platform for LLM agent testing with advanced team collaboration and continuous red teaming. It provides tools for business users and developers to test and evaluate agents in production environments (a minimal SDK sketch follows the feature list), including:
- Team collaboration - Real-time collaboration with shared workspaces, collaborative annotation workflows, and role-based access control for seamless team coordination
- Continuous red teaming - Ongoing detection of new vulnerabilities with automated scanning and monitoring capabilities
- Access control - Manage who can see what data and run which tests across your organization
- Dataset management - Centralized storage and versioning of test cases for consistent testing
- Custom failure categories - Define and categorize your own failure types beyond standard security and business logic issues
- Enterprise compliance features - 2FA, audit logs, SSO, and enterprise-grade security controls
- Custom business checks - Create and deploy your own specialized testing logic and validation rules
- Alerting - Get notified when issues are detected with configurable notification systems
- Scheduled evaluations - Run agent evaluations on a cron-based schedule for continuous monitoring
- Knowledge bases - Store and manage domain knowledge to enhance testing scenarios
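For technical users, the Hub is also scriptable through its Python SDK. Below is a minimal sketch of what a programmatic workflow could look like; the `HubClient` class comes from the `giskard_hub` package, but the URL, API key, and the specific resource methods shown (`datasets.create`, `evaluate`) are illustrative assumptions, so check the SDK reference for the exact API.

```python
# Minimal sketch of a Giskard Hub SDK workflow. The resource and method
# names below are illustrative assumptions; consult the SDK reference
# for the exact API.
from giskard_hub import HubClient

# Hypothetical credentials: point the client at your Hub instance.
hub = HubClient(hub_url="https://hub.example.com", api_key="<your-api-key>")

# Create a dataset of test cases for an agent (hypothetical arguments).
dataset = hub.datasets.create(
    project_id="<project-id>",
    name="checkout-agent-regression",
    description="Adversarial and business-logic test cases",
)

# Launch an evaluation of the agent against that dataset
# (hypothetical method and arguments).
evaluation = hub.evaluate(
    model_id="<agent-id>",
    dataset_id=dataset.id,
)
print(evaluation)
```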
Open source
Giskard Open Source is a Python library for LLM testing and evaluation. It is available on GitHub and formed the basis of our course Red Teaming LLM Applications on DeepLearning.AI.
The library provides a set of tools for testing and evaluating LLMs, including:
- Automated detection of security vulnerabilities using LLM Scan (a minimal sketch follows this list).
- Automated detection of business logic failures using the RAG Evaluation Toolkit (RAGET), also sketched below.
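To make the scan workflow concrete, here is a minimal sketch of LLM Scan, assuming a placeholder `ask_agent` function standing in for your own agent; `giskard.Model` and `giskard.scan` follow the library's documented interface, and the scan relies on a configured LLM client (e.g. an OpenAI API key) to generate adversarial inputs.

```python
import pandas as pd
import giskard

def ask_agent(question: str) -> str:
    """Placeholder: call your own LLM agent and return its answer."""
    ...

# Wrap the agent so Giskard can call it on a dataframe of questions.
def predict(df: pd.DataFrame) -> list:
    return [ask_agent(q) for q in df["question"]]

model = giskard.Model(
    model=predict,
    model_type="text_generation",
    name="Demo agent",
    description="Answers customer questions about our store.",
    feature_names=["question"],
)

# Run the automated vulnerability scan and export an HTML report.
scan_results = giskard.scan(model)
scan_results.to_html("scan_report.html")
```

RAGET follows a similar pattern: build a knowledge base from your documents, generate a synthetic test set grounded in it, and evaluate your agent's answers against that test set. The sketch below uses placeholder documents and a hypothetical answer function reusing `ask_agent` from above.

```python
import pandas as pd
from giskard.rag import KnowledgeBase, evaluate, generate_testset

# Placeholder documents: in practice, load your own knowledge base.
docs = pd.DataFrame({"text": [
    "Our store ships worldwide.",
    "Returns are free within 30 days.",
]})
knowledge_base = KnowledgeBase(docs)

# Generate a synthetic test set grounded in the knowledge base.
testset = generate_testset(
    knowledge_base,
    num_questions=10,
    agent_description="A chatbot answering questions about our store",
)

# Evaluate the agent's answers; `history` holds prior conversation turns.
def answer_fn(question: str, history=None) -> str:
    return ask_agent(question)  # hypothetical agent call from the scan sketch

report = evaluate(answer_fn, testset=testset, knowledge_base=knowledge_base)
report.to_html("rag_report.html")
```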
Unsure about the difference between Open Source and Hub? Check out our comparison guide to learn more about the different features.
Open research
Giskard Research contributes to AI safety and security research, helping to surface and understand the latest advances in the field. Some of this work has been funded by the European Commission and Bpifrance, and we have collaborated with leading AI research organizations such as the AI Incident Database and Google DeepMind.
Papers: Phare (arXiv) | RealHarm (arXiv)