Open In Colab View Notebook on GitHub

LLM product description from keywords

Giskard is an open-source framework for testing all ML models, from LLMs to tabular models. Don’t hesitate to give the project a star on GitHub ⭐️ if you find it useful!

In this notebook, you’ll learn how to create comprehensive test suites for your model in a few lines of code, thanks to Giskard’s open-source Python library.

In this example, we illustrate the procedure using OpenAI Client that is the default one; however, please note that our platform supports a variety of language models. For details on configuring different models, visit our 🤖 Setting up the LLM Client page

In this tutorial we will walk through a practical use case of using the Giskard LLM Scan on a Prompt Chaining task, one step at a time. Given a product name, we will ask the LLM to process 2 chained prompts using langchain in order to provide us with a product description. The 2 prompts can be described as follows:

  1. keywords_prompt_template: Based on the product name (given by the user), the LLM has to provide a list of five to ten relevant keywords that would increase product visibility.

  2. product_prompt_template: Based on the given keywords (given as a response to the first prompt), the LLM has to generate a multi-paragraph rich text product description with emojis that is creative and SEO compliant.


  • Two-step product description generation. 1) Keywords generation -> 2) Description generation;

  • Foundational model: gpt-3.5-turbo


  • Detect vulnerabilities automatically with Giskard’s scan

  • Automatically generate & curate a comprehensive test suite to test your model beyond accuracy-related metrics

Install dependencies

Make sure to install the giskard[llm] flavor of Giskard, which includes support for LLM models.

[ ]:
%pip install "giskard[llm]" --upgrade

Import libraries

import os

import openai
import pandas as pd
from langchain.chains import LLMChain, SequentialChain
from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate

from giskard import Dataset, Model, scan

Notebook settings

[ ]:
# Set the OpenAI API Key environment variable.
openai.api_key = OPENAI_API_KEY

# Display options.
pd.set_option("display.max_colwidth", None)

Define constants

LLM_MODEL = "gpt-3.5-turbo"

TEXT_COLUMN_NAME = "product_name"

# First prompt to generate keywords related to the product name
KEYWORDS_PROMPT_TEMPLATE = ChatPromptTemplate.from_messages([
    ("system", """You are a helpful assistant that generate a CSV list of keywords related to a product name

    Example Format:
    PRODUCT NAME: product name here
    KEYWORDS: keywords separated by commas here

    Generate five to ten keywords that would increase product visibility. Begin!

    ("human", """
    PRODUCT NAME: {product_name}

# Second chained prompt to generate a description based on the given keywords from the first prompt
PRODUCT_PROMPT_TEMPLATE = ChatPromptTemplate.from_messages([
    ("system", """As a Product Description Generator, generate a multi paragraph rich text product description with emojis based on the information provided in the product name and keywords separated by commas.

    Example Format:
    PRODUCT NAME: product name here
    KEYWORDS: keywords separated by commas here
    PRODUCT DESCRIPTION: product description here

    Generate a product description that is creative and SEO compliant. Emojis should be added to make product description look appealing. Begin!

    ("human", """
    PRODUCT NAME: {product_name}
    KEYWORDS: {keywords}

Model building

Create a model with LangChain

Using the prompt templates defined earlier we can create two LLMChain and concatenate them into a SequentialChain that takes as input the product name, and outputs a product description

def generation_function(df: pd.DataFrame):
    llm = ChatOpenAI(temperature=0.2, model=LLM_MODEL)

    # Define the chains.
    keywords_chain = LLMChain(llm=llm, prompt=KEYWORDS_PROMPT_TEMPLATE, output_key="keywords")
    product_chain = LLMChain(llm=llm, prompt=PRODUCT_PROMPT_TEMPLATE, output_key="description")

    # Concatenate both chains.
    product_description_chain = SequentialChain(chains=[keywords_chain, product_chain],

    return [product_description_chain.invoke(product_name) for product_name in df['product_name']]

Detect vulnerabilities in your model

Wrap model and dataset with Giskard

Before running the automatic LLM scan, we need to wrap our model into Giskard’s Model object. We can also optionally create a small dataset of queries to test that the model wrapping worked.

[ ]:
# Wrap the description chain.
giskard_model = Model(
    # A prediction function that encapsulates all the data pre-processing steps and that could be executed with the dataset
    model_type="text_generation",  # Either regression, classification or text_generation.
    name="Product keywords and description generator",  # Optional.
    description="Generate product description based on a product's name and the associated keywords."
                "Description should be using emojis and being SEO compliant.",  # Is used to generate prompts
    feature_names=['product_name']  # Default: all columns of your dataset.

# Optional: Wrap a dataframe of sample input prompts to validate the model wrapping and to narrow specific tests' queries.
corpus = [
    "Double-Sided Cooking Pan",
    "Automatic Plant Watering System",
    "Miniature Exercise Equipment"

giskard_dataset = Dataset(pd.DataFrame({TEXT_COLUMN_NAME: corpus}), target=None)

Let’s check that the model is correctly wrapped by running it:

[ ]:
# Validate the wrapped model and dataset.

Scan your model for vulnerabilities with Giskard

We can now run Giskard’s scan to generate an automatic report about the model vulnerabilities. This will thoroughly test different classes of model vulnerabilities, such as harmfulness, hallucination, prompt injection, etc.

The scan will use a mixture of tests from predefined set of examples, heuristics, and LLM based generations and evaluations.

Since running the whole scan can take a bit of time, let’s start by limiting the analysis to the hallucination category:

[ ]:
results = scan(giskard_model)