Available Metric functions

Correctness

Using an LLM-as-a-judge strategy, the correctness metric checks whether the agent's answer is correct with respect to the reference answer.

giskard.rag.metrics.correctness.correctness_metric(question_sample: dict, answer: AgentAnswer) → dict

RAGAS Metrics

We provide wrappers for some RAGAS metrics. You can implement other RAGAS metrics using the RAGASMetric class.

giskard.rag.metrics.ragas_metrics.ragas_context_precision(question_sample: dict, answer: AgentAnswer) → dict

giskard.rag.metrics.ragas_metrics.ragas_faithfulness(question_sample: dict, answer: AgentAnswer) → dict
giskard.rag.metrics.ragas_metrics.ragas_answer_relevancy(question_sample: dict, answer: AgentAnswer) → dict
giskard.rag.metrics.ragas_metrics.ragas_context_recall(question_sample: dict, answer: AgentAnswer) → dict

Base Metric

class giskard.rag.metrics.Metric(name: str, llm_client: LLMClient | None = None)

Metric base class. All metrics should inherit from this class and implement the __call__ method. Instances of this class can be passed to the evaluate method.

abstract __call__(question_sample: dict, answer: AgentAnswer)

Compute the metric on a single question and its associated answer.

Parameters:
  • question_sample (dict) – A question sample from a QATestset.

  • answer (AgentAnswer) – The agent answer on that question.

Returns:
  The result of the metric computation. The keys should be the names of the metrics computed.

Return type:
  dict