RAG Evaluation Toolkit on a Banking Supervisory Process AgentΒΆ
Install dependencies and download the Banking Supervision reportΒΆ
[ ]:
!pip install "giskard[llm]" --upgrade
!pip install llama-index PyMuPDF
[ ]:
!wget "https://www.bankingsupervision.europa.eu/ecb/pub/pdf/ssm.supervisory_guides202401_manual.en.pdf" -O "banking_supervision_report.pdf"
Build RAG Agent on the Banking Supervision reportΒΆ
[1]:
import pandas as pd
import warnings
pd.set_option("display.max_colwidth", 400)
warnings.filterwarnings('ignore')
[2]:
from llama_index.core import VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter
from llama_index.readers.file import PyMuPDFReader
from llama_index.core.base.llms.types import ChatMessage, MessageRole
loader = PyMuPDFReader()
documents = loader.load(file_path="./banking_supervision_report.pdf")
[5]:
splitter = SentenceSplitter(chunk_size=512)
index = VectorStoreIndex.from_documents(documents, transformations=[splitter])
chat_engine = index.as_chat_engine()
Letβs test the AgentΒΆ
[6]:
str(chat_engine.chat("What is SSM?"))
[6]:
'SSM stands for Single Supervisory Mechanism.'
Generate a test set on the Banking Supervision reportΒΆ
[7]:
from giskard.rag import KnowledgeBase, generate_testset, QATestset
text_nodes = splitter(documents)
knowledge_base_df = pd.DataFrame([node.text for node in text_nodes], columns=["text"])
knowledge_base = KnowledgeBase(knowledge_base_df)
[ ]:
testset = generate_testset(knowledge_base,
num_questions=100,
agent_description="A chatbot answering questions about banking supervision procedures and methodologies.",
language="en")
[ ]:
# Save the testset
testset.save("banking_supervision_testset.jsonl")
# Load the testset
testset = QATestset.load("banking_supervision_testset.jsonl")
[9]:
testset.to_pandas().head(5)
[9]:
question | reference_answer | reference_context | conversation_history | metadata | |
---|---|---|---|---|---|
id | |||||
68320700-52c1-4490-b85b-0fb459ad7e96 | What must a significant institution obtain before including interim or year-end profits in CET1 capital? | A significant institution must obtain prior approval before including interim or year-end profits in its CET1 capital. | Document 162: If necessary, \nadditional information is requested. Next, the JST assesses whether the relevant \nregulations are complied with and establishes whether the capital instrument is listed \nin the EBAβs public list of CET1 instruments. If the instrument is not in the EBAβs \npublic list, before adopting any decision, the ECB consults the EBA. \nSubsequent issuances of instruments f... | [] | {'question_type': 'simple', 'seed_document_id': 162, 'topic': 'ECB Banking Supervision'} |
fbd171ae-5438-4080-8910-a6b55af63d3a | What is the process for an SI wishing to establish a branch in a non-participating Member State? | An SI wishing to establish a branch in a non-participating Member State has to notify the relevant NCA of its intention. Upon receipt of this notification, the NCA informs the ECB, which exercises the powers of the competent authority of the home Member State. The JST assesses whether the requirements for establishing a branch are met. If the requirements are met, the JST prepares a Supervisor... | Document 99: Supervisory Manual β Supervision of all supervised entities \n \n59 \nprovide services can be exercised, subject to national law and in the interests of the \ngeneral good. The ECB carries out the tasks of the competent authority of the host \nMember State for institutions established in non-participating Member States which \nexercise the freedom to provide services in participat... | [] | {'question_type': 'simple', 'seed_document_id': 99, 'topic': 'Banking Supervision and Passporting'} |
4f6d0dc5-4aed-4c78-ba40-c3e2422cbae9 | What are the main channels of political accountability for the ECB? | The main channels of political accountability for the ECB include: 1. The Chair of the Supervisory Board attending regular hearings and ad hoc exchanges of views in the European Parliament and the Eurogroup. National parliaments can also invite the Chair or another member of the Supervisory Board, along with a representative from the respective NCA. 2. The ECB provides the European Parliamentβ... | Document 8: There are several main channels of \npolitical accountability for the ECB: \n1. \nThe Chair of the Supervisory Board attends regular hearings and ad hoc \nexchanges of views in the European Parliament and the Eurogroup. National \nparliaments can also invite the Chair or another member of the Supervisory \nBoard, along with a representative from the respective NCA. \n2. \nThe ECB p... | [] | {'question_type': 'simple', 'seed_document_id': 8, 'topic': 'Others'} |
9367137e-42a0-4052-8700-5126ef4c4b1e | What are the responsibilities of the Joint Supervisory Teams (JSTs) in day-to-day supervision? | The day-to-day supervision of SIs is primarily conducted off-site by the JSTs, which comprise staff from NCAs and the ECB and are supported by the horizontal and specialised expertise divisions of DG/HOL and similar staff at the NCAs. The JST analyses the supervisory reporting, financial statements and internal documentation of supervised entities. They hold regular and ad hoc meetings with th... | Document 75: Supervisory Manual β Supervisory cycle \n \n45 \nThe SREP is applied proportionately to both SIs and LSIs, ensuring that the highest \nand most consistent supervisory standards are upheld. \nIn addition to ongoing activities, the ECB takes ad hoc supervisory actions through \nthe Directorate General SSM Governance & Operations (DG/SGO); the \nAuthorisation Division of DG/SGO grant... | [] | {'question_type': 'simple', 'seed_document_id': 75, 'topic': 'European Banking Supervision'} |
0b3f0ff0-9c5c-4704-a884-64e7d720953f | What role does the ECB play in the supervision of significant institutions? | The ECB acts as a gatekeeper to ensure that significant supervised entities comply with robust governance arrangements, including fit and proper requirements for management. It also has direct authority to exercise supervisory powers regarding the approval of key function holders and branch managers in significant institutions under national law. | Document 128: Supervisory Manual β Supervision of significant institutions \n \n75 \nThe responsibility of the ECB is to act as a gatekeeper. Its task is to ensure that \nsignificant supervised entities comply with the requirements to have in place robust \ngovernance arrangements, including the fit and proper requirements for the persons \nresponsible for the management of institutions. The E... | [] | {'question_type': 'simple', 'seed_document_id': 128, 'topic': 'Others'} |
Evaluate and Diagnose the AgentΒΆ
[13]:
from giskard.rag import evaluate, RAGReport
from giskard.rag.metrics.ragas_metrics import ragas_context_recall, ragas_context_precision
[ ]:
def answer_fn(question, history=None):
if history:
answer = chat_engine.chat(question, chat_history=[ChatMessage(role=MessageRole.USER if msg["role"] =="user" else MessageRole.ASSISTANT,
content=msg["content"]) for msg in history])
else:
answer = chat_engine.chat(question, chat_history=[])
return str(answer)
report = evaluate(answer_fn,
testset=testset,
knowledge_base=knowledge_base,
metrics=[ragas_context_recall, ragas_context_precision])
[19]:
# Save the report
report.save("banking_supervision_report")
# Load the report
report = RAGReport.load("banking_supervision_report")
[15]:
display(report.to_html(embed=True))
RAGET question typesΒΆ
Each question type assesses a few RAG components. This makes it possible to localize weaknesses in the RAG Agent and give feedback to the developers.
Question type |
Description |
Example |
Targeted RAG components |
---|---|---|---|
Simple |
Simple questions generated from an excerpt of the knowledge base |
What is the purpose of the holistic approach in the SREP? |
|
Complex |
Questions made more complex by paraphrasing |
In what capacity and with what frequency do NCAs contribute to the formulation and scheduling of supervisory activities, especially concerning the organization of on-site missions? |
|
Distracting |
Questions made to confuse the retrieval part of the RAG with a distracting element from the knowledge base but irrelevant to the question |
Under what conditions does the ECB levy fees to cover the costs of its supervisory tasks, particularly in the context of financial conglomerates requiring cross-sector supervision? |
|
Situational |
Questions including user context to evaluate the ability of the generation to produce relevant answer according to the context |
As a bank manager looking to understand the appeal process for a regulatory decision made by the ECB, could you explain what role the ABoR plays in the supervisory decision review process? |
|
Double |
Questions with two distinct parts to evaluate the capabilities of the query rewriter of the RAG |
What role does the SSM Secretariat Division play in the decision-making process of the ECBβs supervisory tasks, and which directorates general are involved in the preparation of draft decisions for supervised entities in the ECB Banking Supervision? |
|
Conversational |
Questions made as part of a conversation, first message describe the context of the question that is ask in the last message, also tests the rewriter |
What are these sources? |
|