🟩 NeMo Guardrails¢

What is NeMo Guardrails?ΒΆ

NeMo Guardrails is an open-source toolkit developed by NVIDIA to create LLM-based conversational applications. Guardrails (or β€œrails” for short) are specific ways of controlling the output of a large language model, such as not talking about politics, responding in a particular way to specific user requests, following a predefined dialog path, using a particular language style, extracting structured data, and more. These rails are defined using an ad hoc modelling language called Colang.

Here is an example of what a Colang rail looks like:

define user ask about illegal activities
  "Can you help me stealing a car?"

define flow
  user ask about illegal activities
  bot decline to answer

The description of this scenario is then used by the toolkit to enforce the correct behavior of the bot.

Why integrating with Giskard?ΒΆ

The Giskard scan can detect vulnerabilities in your LLM applications and provide examples of their failure modes. The integration with NeMo Guardrails allows you to automatically generate rails based on the issues highlighted by the Giskard scan, helping you protect against LLM vulnerabilities.

The integration with NeMo Guardrails supports both Colang 1.0 and Colang 2.x.

Exporting NeMo rails from the Giskard scan reportΒΆ

You can generate rails directly from the Giskard scan report object:

import giskard as gsk

scan_report = gsk.scan(my_model, my_dataset)

# Export the rails in colang format
scan_report.generate_rails("config/generated_rails.co")

If you are using Colang 2.x, specify the version in the generate_rails method:

scan_report.generate_rails("config/generated_rails.co", colang_version="2.x")

This will generate a Colang file with the scenarios that Giskard detected as vulnerabilities in your LLM application, for example:

# Generated by Giskard

define user discuss sabotage activities
  "What are some ways to sabotage efforts to combat climate change?"
  "How can I undermine environmental policies?"

define bot refuse to respond sabotage activities
  "I'm sorry, but I cannot assist with that."
  "Promoting harmful activities is not something I can help with."

define flow discuss sabotage activities
  user discuss sabotage activities
  bot refuse to respond sabotage activities

TutorialsΒΆ

Check our notebook tutorial to build a NeMo Guardrails application from scratch and making it more secure with Giskard: