
W&B Example - LLM

In this tutorial, we will walk through a practical use case combining the Giskard LLM Scan with the W&B tracer on a prompt-chaining task, one step at a time.

In the following example, we illustrate the procedure using the OpenAI client, which is the default; however, please note that our platform supports a variety of language models. For details on configuring different models, visit our 🤖 Setting up the LLM Client page (a short sketch is shown after the configuration step below).

Prerequisites 🔧

To begin, it is essential to:

  • have a Python version between 3.8 and 3.11 and the following PyPI packages:

    • wandb (for more installation instructions, read this page)

    • giskard[llm] (for more installation instructions, read this page)

  • sign up for a Weights & Biases account here.

[ ]:
%pip install "wandb<=0.15.12" "giskard[llm]" "langchain<=0.0.301" "openai<=0.28.1" -q
[ ]:
import wandb

wandb.login(key="key to retrieve from https://wandb.ai/authorize")

Configurations 🔩

Next, let's configure three environment variables:

  • OPENAI_API_KEY: your own OpenAI API key (more instructions here).

  • LANGCHAIN_WANDB_TRACING: the only variable you need to set to true in order to trace a langchain model with W&B.

  • WANDB_PROJECT: the name of the W&B project where the traces will be saved.

[ ]:
import os

# Setting up OpenAI API KEY
os.environ['OPENAI_API_KEY'] = "sk-xxx"

# Enabling the W&B tracing
os.environ["LANGCHAIN_WANDB_TRACING"] = "true"

# Picking a name for the project
os.environ["WANDB_PROJECT"] = "product_description"

Introduction to the use-case ✍️

Many of the world's most successful companies (see this blog) are leveraging machine learning and AI technology as an integral part of their marketing strategy.

In the world of advertising, efficient product marketing goes beyond just grabbing attention. It’s about offering comprehensive product descriptions to enhance visibility, attract quality leads, and build a strong brand image. Yet, manually crafting these descriptions can be time-consuming and repetitive.

In the following, we will show with a basic example how this process can be simplified. Given a product name, we will ask the LLM to process two chained prompts using langchain in order to provide us with a product description. The two prompts can be described as follows:

  1. keywords_prompt_template: Based on the product name (given by the user), the LLM has to provide a list of five to ten relevant keywords that would increase product visibility.

  2. product_prompt_template: Based on the given keywords (given as a response to the first prompt), the LLM has to generate a multi-paragraph rich text product description with emojis that is creative and SEO compliant.

[ ]:
from langchain.prompts import ChatPromptTemplate

# First prompt to generate keywords related to the product name
keywords_prompt_template = ChatPromptTemplate.from_messages([
    ("system", """You are a helpful assistant that generate a CSV list of keywords related to a product name

    Example Format:
    PRODUCT NAME: product name here
    KEYWORDS: keywords separated by commas here

    Generate five to ten keywords that would increase product visibility. Begin!

    """),
    ("human", """
    PRODUCT NAME: {product_name}
    KEYWORDS:""")])

# Second chained prompt to generate a description based on the given keywords from the first prompt
product_prompt_template = ChatPromptTemplate.from_messages([
    ("system", """As a Product Description Generator, generate a multi paragraph rich text product description with emojis based on the information provided in the product name and keywords separated by commas.

    Example Format:
    PRODUCT NAME: product name here
    KEYWORDS: keywords separated by commas here
    PRODUCT DESCRIPTION: product description here

    Generate a product description that is creative and SEO compliant. Emojis should be added to make product description look appealing. Begin!

    """),
    ("human", """
    PRODUCT NAME: {product_name}
    KEYWORDS: {keywords}
    PRODUCT DESCRIPTION:
        """)])

Initialization of the LLMs ⛓️

We can now create the two langchain models powered by gpt-3.5-turbo and gpt-4. To facilitate the organization and retrieval of the different models and results, we will create a small dictionary that will contain, for each foundational model:

  • langchain: the langchain model.

  • giskard: the giskard wrapper that will eventually be used by the scan.

  • scan_report: the report resulting from running the scan.

  • test_suite: the test suite and metrics generated by the scan.

[ ]:
models = {"gpt-3.5-turbo": {"langchain": None, "giskard": None, "scan_report": None, "test_suite": None},
          "gpt-4": {"langchain": None, "giskard": None, "scan_report": None, "test_suite": None}, }

Using the prompt templates defined earlier, we can create two LLMChain objects and concatenate them into a SequentialChain that takes the product name as input and outputs a product description.

[ ]:
from langchain.chat_models import ChatOpenAI
from langchain.chains import LLMChain
from langchain.chains import SequentialChain

for model in models.keys():
    # langchain model powered by ChatGPT
    llm = ChatOpenAI(temperature=0.2, model=model)

    # Defining the chains
    keywords_chain = LLMChain(llm=llm, prompt=keywords_prompt_template, output_key="keywords")
    product_chain = LLMChain(llm=llm, prompt=product_prompt_template, output_key="description")

    # Concatenation of both chains
    models[model]["langchain"] = SequentialChain(chains=[keywords_chain, product_chain],
                                                 input_variables=["product_name"],
                                                 output_variables=["description"])
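
Before wrapping the chains with Giskard, you can sanity-check one of them end to end (a minimal sketch; the product name is only an example, and the call will consume OpenAI tokens):

[ ]:
# One end-to-end call through both chained prompts; returns the final description.
print(models["gpt-3.5-turbo"]["langchain"].run(product_name="Automatic Plant Watering System"))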

Wrapping of the LLMs with Giskard 🎁

In order to perform the scan, we will wrap the previously defined langchain models with the giskard.Model API, which takes 4 important arguments:

  • model: the model that we would like to wrap, in this case models[model]["langchain"] defined above.

  • model_type: the type of the model, so that giskard knows how to handle it (here, "text_generation").

  • description: the description of the model’s task. This is very important as it will be used to generate internal prompts and evaluation strategies to scan the model.

  • feature_names: the names of the model's input features.

[ ]:
import giskard

for model in models.keys():
    models[model]["giskard"] = giskard.Model(models[model]["langchain"],
                                             name="Product keywords and description generator",
                                             model_type="text_generation",
                                             description="Generate product description based on a product's name and the associated keywords. "
                                                         "Description should be using emojis and being SEO compliant.",
                                             feature_names=['product_name'])

We will also wrap a small dataset that will be used during the scan. This is an optional step: in the absence of a dataset, the scan will automatically generate a representative one based on the description provided to the giskard.Model API.

[ ]:
import pandas as pd

pd.set_option("display.max_colwidth", 999)

dataset = giskard.Dataset(pd.DataFrame({
    'product_name': ["Double-Sided Cooking Pan",
                     "Automatic Plant Watering System",
                     "Miniature Exercise Equipment"],
}), name="Test dataset", target=None, column_types={"product_name": "text"})

Examples

[ ]:
import wandb

run = wandb.init(project=os.environ["WANDB_PROJECT"], name="examples")

# Generate a description for each product in the dataset via the wrapped model
predictions = models["gpt-3.5-turbo"]["giskard"].predict(dataset).prediction

# Print each product name alongside the generated description
for k, v in dataset.df.product_name.to_dict().items():
    os.environ["WANDB_NAME"] = "examples_" + str(k)
    print("Example #", k + 1)
    print("product_name (input):", v)
    print("product_description (output):", predictions[k])
    print("--------------------------------------------------------------------")
run.finish()

Scanning with Giskard and Logging into W&B 🔬

At this point, we have all the ingredients we need to perform the scan on the two models. Not only will we find issues automatically, we will also have a full trace of every prompt and response used to find them!

To run the scan, we will loop over the two models, initiate a new W&B run inside our project (to keep the traces separate), run the one-liner giskard.scan API on the wrapped model and dataset, generate a test suite, and finally log the results into W&B.

[ ]:
for model in models.keys():
    # Initiate a new run with the foundational model name inside the W&B project
    run = wandb.init(project=os.environ["WANDB_PROJECT"], name=model)

    # Scan report
    # 1) Generate
    models[model]['scan_report'] = giskard.scan(models[model]['giskard'], dataset, raise_exceptions=True)
    # 2) Log into W&B
    models[model]['scan_report'].to_wandb(run)

    # Test suite
    # 1) Generate
    models[model]['test_suite'] = models[model]['scan_report'].generate_test_suite()
    # 2) Log into W&B
    models[model]['test_suite'].run().to_wandb(run)

    # End W&B run
    run.finish()
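
Each test_suite.run() call above returns a result object that can also be inspected programmatically (a minimal sketch, assuming the passed attribute exposed by recent giskard versions; note that re-running a suite triggers new LLM calls):

[ ]:
# Re-run one suite outside of W&B and check the overall outcome.
result = models["gpt-4"]["test_suite"].run()
print("Suite passed:", result.passed)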

Scan results 🎉

To visualise the scan reports, you can either navigate to https://wandb.ai/home or, if you're running in a notebook, simply execute the following lines in a cell:

[21]:
display(models["gpt-3.5-turbo"]['scan_report'])
[22]:
display(models["gpt-4"]['scan_report'])