Generate scenario-based tests

Scenario-based dataset generation lets you create targeted, business-specific tests without editing your agent’s core description or functionality. It is useful when you want to move beyond general testing and simulate how your agents handle specific personas and complex business logic.

Scenario-based generations are a powerful way to ensure your agent is prepared for real-world user scenarios and personas. They are:

Fully customizable: Tailor tests to the personas you envision and that matter to your departments
Rule-driven: Move from generic stress testing to scenarios driven by explicit rules
Higher quality: Produce datasets that are more reliable for evaluations
Business-focused: Work toward an agent that truly understands your business boundaries

Getting started

To begin, navigate to the Datasets page and click Generate in the upper-right corner of the screen. This opens a modal with three options, including Knowledge Base and Scenario. Select the Scenario option.

"Select scenario option from generation modal"

Select or create a persona

You’ll see a subset of the personas you’ve defined, that is, the kinds of users who might interact with your agents. You can select an existing persona or create a new one.

When creating a new persona, it helps to provide:

A descriptive name: This helps identify the persona quickly
A description: This helps the generator understand the persona and ensures the generated test cases align with it

Define rules

You can then add rules that define behaviors your agent should respect and that are at risk of being broken when it interacts with the selected persona. These rules are used to generate test cases that specifically check whether your agent maintains these behaviors.

For example:

Persona: Customer using slang/emojis asking about loans
Rules: Enforce a professional tone and refuse to do interest calculations

Persona: Crypto investor seeking investment advice
Rules: Refuse to provide unauthorized financial advice and avoid making specific investment recommendations

After defining a set of rules, you can add them to the scenario.
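
As a rough mental model, the persona and rule pairs above can be written down as plain data. The Python sketch below is illustrative only: the `Persona` and `Scenario` classes, their field names, and the short persona names are hypothetical and are not part of any product API. Scenarios themselves are configured in the UI as described in this guide.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Persona:
    """Hypothetical stand-in for a persona defined in the UI."""
    name: str
    description: str


@dataclass
class Scenario:
    """Hypothetical pairing of one persona with the rules the agent must respect."""
    persona: Persona
    rules: list[str] = field(default_factory=list)

    def add_rule(self, rule: str) -> None:
        """Add a rule, rejecting empty text and ignoring exact duplicates."""
        rule = rule.strip()
        if not rule:
            raise ValueError("a rule must not be empty")
        if rule not in self.rules:
            self.rules.append(rule)


# The persona names below are made up for illustration; the descriptions
# and rules come from the examples in this guide.
slang_customer = Scenario(
    persona=Persona(
        name="Slang customer",
        description="Customer using slang/emojis asking about loans",
    )
)
slang_customer.add_rule("Enforce a professional tone")
slang_customer.add_rule("Refuse to do interest calculations")

crypto_investor = Scenario(
    persona=Persona(
        name="Crypto investor",
        description="Crypto investor seeking investment advice",
    )
)
crypto_investor.add_rule("Refuse to provide unauthorized financial advice")
crypto_investor.add_rule("Avoid making specific investment recommendations")

for scenario in (slang_customer, crypto_investor):
    print(scenario.persona.name, "->", scenario.rules)
```

Splitting a compound rule into one statement per entry, as done here, is a design choice of the sketch; it makes it easier to see which behavior a given test case is meant to probe.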

Generate test cases

Once you’ve configured your persona and rules, you can:

Select your agent: Choose the agent you want to test (e.g., Zephyr Bank multilingual agent)
Set the number of test cases: Specify how many test cases you want to generate

"Select agent and number of test cases to generate"

Start the generation, which should be relatively quick. When it finishes, you’ll have a high-quality dataset of generated test cases ready for review.
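
The inputs collected before a generation run can be summarized in a small record. The sketch below is a hypothetical checklist in Python, not the product's API: `GenerationRequest`, its fields, and the specific checks (for instance, requiring at least one rule) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenerationRequest:
    """Hypothetical summary of what the generation modal asks for."""
    agent_name: str
    persona_name: str
    rules: tuple[str, ...]
    num_test_cases: int

    def problems(self) -> list[str]:
        """Return human-readable reasons the request looks incomplete."""
        issues = []
        if not self.agent_name.strip():
            issues.append("no agent selected")
        if not self.persona_name.strip():
            issues.append("no persona selected")
        if not self.rules:
            # Assumption of this sketch: a scenario without rules is not useful.
            issues.append("no rules defined")
        if self.num_test_cases < 1:
            issues.append("number of test cases must be at least 1")
        return issues


request = GenerationRequest(
    agent_name="Zephyr Bank multilingual agent",
    persona_name="Crypto investor",  # illustrative name
    rules=("Refuse to provide unauthorized financial advice",),
    num_test_cases=10,  # illustrative value
)
print(request.problems())  # prints []
```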

Review and evaluate

Each generated test case contains a user message that adheres to the persona. You can generate an answer to evaluate your agent’s response and see whether it adheres to the rules.

After generating an example response, you can also test the evaluation. If the evaluation passes, you have a meaningful test example. You can then add this example to an evaluation dataset and use it in evaluation runs that need a high-quality dataset to iterate on.
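
The review step above amounts to a simple filter: keep the test cases that have a generated response and a passing evaluation. The Python sketch below illustrates that idea with hypothetical names (`GeneratedTestCase`, `keep_for_dataset`) and made-up sample messages; it does not reflect how the product stores or evaluates test cases.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GeneratedTestCase:
    """Hypothetical record for one generated test case under review."""
    user_message: str
    agent_response: Optional[str] = None  # filled in once an answer is generated
    passed: Optional[bool] = None  # None until the evaluation has been run


def keep_for_dataset(cases: list[GeneratedTestCase]) -> list[GeneratedTestCase]:
    """Keep only cases that have a response and whose evaluation passed."""
    return [
        case
        for case in cases
        if case.agent_response is not None and case.passed is True
    ]


# Made-up sample data, for illustration only.
cases = [
    GeneratedTestCase(
        user_message="yo what's my loan interest gonna be??",
        agent_response="I can't calculate interest, but I can explain our loan options.",
        passed=True,
    ),
    GeneratedTestCase(
        user_message="which coin should i buy rn",
        agent_response="You should buy the one that is trending.",
        passed=False,
    ),
    GeneratedTestCase(user_message="can u explain loans pls"),  # not yet evaluated
]

selected = keep_for_dataset(cases)
print(len(selected))  # prints 1
```

Cases that have not been evaluated yet are excluded rather than treated as failures, so they can be revisited once a response and an evaluation exist.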

Next steps

Review test cases - Make sure to Review and refine test cases and metrics
Generate knowledge base tests - Try Generate knowledge base tests
Agentic vulnerability detection - Try Launch vulnerability scans