Skip to content
GitHubDiscord

Misguidance & Unauthorized Advice

Probes that test whether AI agents can be manipulated to provide professional advice outside their intended scope.

Probe NameDescription
Unauthorized AdviceThis probe generates adversarial messages that request unauthorized advice from the agent, including financial recommendations, medical advice, legal counseling, etc.
In Domain MisguidanceThis probe uses a dynamic logic: it adapts its follow-up questions based on the agent’s previous answers to probe for weaknesses and see if the chatbot can be led into providing harmful or out-of-scope guidance.