Knowledge Base
- class giskard.rag.knowledge_base.KnowledgeBase(data: DataFrame, columns: Sequence[str] | None = None, seed: int | None = None, llm_client: LLMClient | None = None, embedding_model: BaseEmbedding | None = None, min_topic_size: int | None = None, chunk_size: int = 2048)
A class to handle the knowledge base and the associated vector store.
- Parameters:
data (pd.DataFrame) – A dataframe containing the whole knowledge base.
columns (Sequence[str], optional) – The list of columns of the knowledge base dataframe to consider. If not specified, all columns are concatenated to produce a single document per row. Example: if your knowledge base consists of FAQ data with columns "Q" and "A", each row is formatted into a single document "Q: [question]\nA: [answer]", which is then used to generate questions.
seed (int, optional) – The seed to use for random number generation.
llm_client (LLMClient, optional) – The LLM client to use for question generation. If not specified, a default OpenAI client is used.
embedding_model (BaseEmbedding, optional) – The Giskard embedding model to use for the knowledge base. Defaults to the Giskard default model, which is OpenAI "text-embedding-ada-002".
min_topic_size (int, optional) – The minimum number of documents required to form a topic inside the knowledge base.
chunk_size (int, optional) – The number of documents to embed in a single batch. Defaults to 2048.
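To make the columns and chunk_size descriptions concrete, here is a minimal pure-Python sketch of the two ideas: joining the selected columns of a row into one "name: value" document, and splitting documents into batches of at most chunk_size. The helpers format_row and batched are hypothetical names introduced for illustration only; this is not Giskard's implementation, and the exact formatting used by the library may differ.

```python
from typing import Iterator, Mapping, Optional, Sequence


def format_row(row: Mapping[str, str], columns: Optional[Sequence[str]] = None) -> str:
    """Join the selected columns of one row into a single document string.

    Hypothetical helper: when `columns` is None, every column of the row is used.
    """
    selected = list(columns) if columns is not None else list(row)
    return "\n".join(f"{name}: {row[name]}" for name in selected)


def batched(documents: Sequence[str], chunk_size: int = 2048) -> Iterator[Sequence[str]]:
    """Yield consecutive slices holding at most `chunk_size` documents each."""
    for start in range(0, len(documents), chunk_size):
        yield documents[start:start + chunk_size]


# Made-up FAQ rows standing in for the rows of a knowledge base dataframe.
faq_rows = [
    {"Q": "How do I reset my password?", "A": "Use the account settings page.", "id": "1"},
    {"Q": "Can I export my data?", "A": "Yes, as a CSV file.", "id": "2"},
    {"Q": "Is there a free plan?", "A": "Yes.", "id": "3"},
]

# Only the "Q" and "A" columns contribute to each document; "id" is ignored.
documents = [format_row(row, columns=["Q", "A"]) for row in faq_rows]

# A tiny chunk_size is used here so that the batching is visible.
batches = list(batched(documents, chunk_size=2))
```

With three rows and chunk_size=2, the documents are split into one batch of two and one batch of one. A larger chunk_size means fewer, larger embedding requests.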
- classmethod from_pandas(df: DataFrame, columns: Sequence[str] | None = None, **kwargs) -> KnowledgeBase
Create a KnowledgeBase from a pandas DataFrame.
- Parameters:
df (pd.DataFrame) – The DataFrame containing the knowledge base.
columns (Sequence[str], optional) – The list of columns of the DataFrame to consider. If not specified, all columns are concatenated to produce a single document per row. Example: if your knowledge base consists of FAQ data with columns "Q" and "A", each row is formatted into a single document "Q: [question]\nA: [answer]", which is then used to generate questions.
kwargs – Additional keyword arguments for the knowledge base (see __init__).
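A short usage sketch for from_pandas, based only on the signatures documented above. The FAQ rows and the wrapper function build_faq_knowledge_base are hypothetical and exist purely for illustration. The imports are deferred into the function so that the snippet can be loaded without the optional dependencies installed.

```python
# Made-up FAQ rows; the "source" column is deliberately excluded from the documents.
FAQ_ROWS = [
    {"Q": "How do I reset my password?", "A": "Use the account settings page.", "source": "help"},
    {"Q": "Can I export my data?", "A": "Yes, as a CSV file.", "source": "help"},
]


def build_faq_knowledge_base(rows=FAQ_ROWS, seed=42):
    """Build a KnowledgeBase from FAQ rows (hypothetical wrapper, for illustration)."""
    # Deferred imports: both packages must be installed before calling this function.
    import pandas as pd
    from giskard.rag.knowledge_base import KnowledgeBase

    df = pd.DataFrame(rows)
    # `seed` is not a named parameter of from_pandas; it travels through **kwargs
    # to the constructor, like any other __init__ setting.
    return KnowledgeBase.from_pandas(df, columns=["Q", "A"], seed=seed)
```

Calling build_faq_knowledge_base() requires pandas and giskard to be installed. Because the default LLM client and embedding model are OpenAI-based, using the resulting knowledge base will likely also require OpenAI credentials unless llm_client and embedding_model are passed explicitly.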