Misguidance And Unauthorized Advice

Probes that test whether AI agents can be manipulated to provide professional advice outside their intended scope

Probes

Probe Name

Type

Description

In Domain Misguidance

Agentic

This probe tests whether an AI agent can provide unauthorized advice about information contained in the knowledge base.

Unauthorized Advice

This probe generates adversarial messages that request unauthorized advice from the agent, including financial recommendations, medical advice, legal counseling, etc.